HETEROLOGOUS EXPRESSION OF NEISSERIAL PROTEINS

All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the field of protein expression. In particular, it relates to the heterologous
expression of proteins from Neisseria (e.g. N. gonorrhoeae or, preferably, N. meningitidis).

BACKGROUND ART 

International patent applications WO99/24578, WO99/36544, WO99/57280 and
WO00/22430 disclose proteins from Neisseria meningitidis and Neisseria gonorrhoeae.
These proteins are typically described as being expressed in E.coli (i.e. heterologous
expression) as either N-terminal GST-fusions or C-terminal His-tag fusions, although other
expression systems, including expression in native Neisseria, are also disclosed.

It is an object of the present invention to provide alternative and improved approaches for 
the heterologous expression of these proteins. These approaches will typically affect the 
level of expression, the ease of purification, the cellular localisation of expression, and/or the
immunological properties of the expressed protein.

DISCLOSURE OF THE INVENTION

Nomenclature herein

The 2166 protein sequences disclosed in WO99/24578, WO99/36544 and WO99/57280 are
referred to herein by the following SEQ# numbers:



Application    Protein sequences         SEQ# herein
WO99/24578     Even SEQ IDs 2-892        SEQ#s 1-446
WO99/36544     Even SEQ IDs 2-90         SEQ#s 447-491
WO99/57280     Even SEQ IDs 2-3020       SEQ#s 492-2001
               Even SEQ IDs 3040-3114    SEQ#s 2002-2039
               SEQ IDs 3115-3241         SEQ#s 2040-2166



In addition to this SEQ# numbering, the naming conventions used in WO99/24578,
WO99/36544 and WO99/57280 are also used (e.g. 'ORF4', 'ORF40', 'ORF40-1' etc. as
used in WO99/24578 and WO99/36544; 'm919', 'g919' and 'a919' etc. as used in
WO99/57280).

The 2160 proteins NMB0001 to NMB2160 from Tettelin et al. [Science (2000) 287:1809-1815]
are referred to herein as SEQ#s 2167-4326 [see also WO00/66791].
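The renumbering in the table above is simple arithmetic, and can be sketched as follows. This is a minimal illustration only: the function names `seq_number` and `nmb_seq_number` and the `RANGES` layout are hypothetical, and the ranges are copied from the table, on the assumption that each range is numbered consecutively in order.

```python
# Hypothetical helper illustrating the SEQ# renumbering described in the text.
# Each entry: (application, first SEQ ID, last SEQ ID, step, first SEQ# herein).
RANGES = [
    ("WO99/24578", 2, 892, 2, 1),
    ("WO99/36544", 2, 90, 2, 447),
    ("WO99/57280", 2, 3020, 2, 492),
    ("WO99/57280", 3040, 3114, 2, 2002),
    ("WO99/57280", 3115, 3241, 1, 2040),
]


def seq_number(application: str, seq_id: int) -> int:
    """Map a SEQ ID from a cited application to the SEQ# used herein."""
    for app, first, last, step, base in RANGES:
        in_range = first <= seq_id <= last
        if app == application and in_range and (seq_id - first) % step == 0:
            return base + (seq_id - first) // step
    raise ValueError(f"SEQ ID {seq_id} of {application} is not a listed protein sequence")


def nmb_seq_number(nmb: int) -> int:
    """Map NMB0001..NMB2160 (Tettelin et al.) to SEQ#s 2167-4326."""
    if not 1 <= nmb <= 2160:
        raise ValueError("NMB number must be between 1 and 2160")
    return 2166 + nmb


if __name__ == "__main__":
    print(seq_number("WO99/57280", 3115))
    print(nmb_seq_number(2050))
```

The end points of each range reproduce the totals stated in the text (2166 sequences from the three applications and 4326 overall), which is a useful sanity check on the table.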

The term 'protein of the invention' as used herein refers to a protein comprising: 

(a) one of sequences SEQ#s 1-4326; or 

(b) a sequence having sequence identity to one of SEQ#s 1-4326; or 

(c) a fragment of one of SEQ#s 1-4326. 

The degree of 'sequence identity' referred to in (b) is preferably greater than 50% (e.g. 60%,
70%, 80%, 90%, 95%, 99% or more). This includes mutants and allelic variants [e.g. see
WO00/66741]. Identity is preferably determined by the Smith-Waterman homology search
algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap
search with parameters gap open penalty=12 and gap extension penalty=1. Typically, 50%
identity or more between two proteins is considered to be an indication of functional
equivalence.

The 'fragment' referred to in (c) should comprise at least n consecutive amino acids from
one of SEQ#s 1-4326 and, depending on the particular sequence, n is 7 or more (e.g. 8, 10,
12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100 or more). Preferably the fragment
comprises an epitope from one of SEQ#s 1-4326. Preferred fragments are those disclosed in
WO00/71574 and WO01/04316.
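The 'at least n consecutive amino acids' criterion can be illustrated with a short sketch. The function name `shares_consecutive_run` is hypothetical, and the reference string is only the first 30 residues of protein 919 as listed in Example 1, used here purely as sample data.

```python
# Hypothetical check for the fragment definition: does a candidate sequence
# contain at least n consecutive residues taken from a reference sequence?
def shares_consecutive_run(candidate: str, reference: str, n: int = 7) -> bool:
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of length n along the candidate and look it up in the reference.
    return any(
        candidate[i:i + n] in reference
        for i in range(len(candidate) - n + 1)
    )


# First 30 residues of protein 919 (Example 1), for illustration only.
REFERENCE = "MKKYLFRAALYGIAAAILAACQSKSIQTFP"

if __name__ == "__main__":
    print(shares_consecutive_run("QSKSIQTFP", REFERENCE))      # 9 shared residues
    print(shares_consecutive_run("QSKSIQ", REFERENCE))         # only 6 residues
    print(shares_consecutive_run("QSKSIQTFP", REFERENCE, 10))  # stricter n
```

This tests only the length criterion; whether a fragment comprises an epitope is an experimental question that the sketch does not address.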

Preferred proteins of the invention are found in N. meningitidis serogroup B. 

Preferred proteins for use according to the invention are those of serogroup B N. meningitidis 
strain 2996 or strain 394/98 (a New Zealand strain). Unless otherwise stated, proteins 
mentioned herein are from N. meningitidis strain 2996. It will be appreciated, however, that 
the invention is not in general limited by strain. References to a particular protein (e.g. '287',
'919' etc.) may be taken to include that protein from any strain. 

Non-fusion expression 

In a first approach to heterologous expression, no fusion partner is used, and the native 
leader peptide (if present) is used. This will typically prevent any 'interference' from fusion 
partners and may alter cellular localisation and/or post-translational modification and/or 
folding in the heterologous host. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) no fusion partner is used, and (b) the protein's native leader peptide
(if present) is used.

The method will typically involve the step of preparing a vector for expressing a protein of
the invention, such that the first expressed amino acid is the first amino acid (methionine) of
said protein, and the last expressed amino acid is the last amino acid of said protein (i.e. the
codon preceding the native STOP codon).

This approach is preferably used for the expression of the following proteins using the native
leader peptide: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503,
519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 936-1, 953,
961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2,
Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109 and NMB2050.
The suffix 'L' used herein in the name of a protein indicates expression in this manner using
the native leader peptide.

Proteins which are preferably expressed using this approach using no fusion partner and
which have no native leader peptide include: 008, 105, 117-1, 121-1, 122-1, 128-1, 148,
216, 243, 308, 593, 652, 726, 926, 982, Orf83-1 and Orf143-1.

Advantageously, it is used for the expression of ORF25 or ORF40, resulting in a protein 
which induces better anti-bactericidal antibodies than GST- or His-fusions. 

This approach is particularly suited for expressing lipoproteins.

Leader-peptide substitution

In a second approach to heterologous expression, the native leader peptide of a protein of the
invention is replaced by that of a different protein. In addition, it is preferred that no fusion
partner is used. Whilst using a protein's own leader peptide in heterologous hosts can often
localise the protein to its 'natural' cellular location, in some cases the leader sequence is not
efficiently recognised by the heterologous host. In such cases, a leader peptide known to
drive protein targeting efficiently can be used instead.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's leader peptide is replaced by the leader peptide from a
different protein and, optionally, (b) no fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's
leader peptide and to introduce nucleotides that encode a different protein's leader peptide.
The resulting nucleic acid may be inserted into an expression vector, or may already be part
of an expression vector. The expressed protein will consist of the replacement leader peptide
at the N-terminus, followed by the protein of the invention minus its leader peptide.

The leader peptide is preferably from another protein of the invention (e.g. one of SEQ#s
1-4326), but may also be from an E.coli protein (e.g. the OmpA leader peptide) or an
Erwinia carotovora protein (e.g. the PelB leader peptide), for instance.

A particularly useful replacement leader peptide is that of ORF4. This leader is able to direct
lipidation in E.coli, improving cellular localisation, and is particularly useful for the
expression of proteins 287, 919 and ΔG287. The leader peptide and N-terminal domains of
961 are also particularly useful.

Another useful replacement leader peptide is that of E.coli OmpA. This leader is able to
direct membrane localisation in E.coli. It is particularly advantageous for the expression of
ORF1, resulting in a protein which induces better anti-bactericidal antibodies than both
fusions and protein expressed from its own leader peptide.

Another useful replacement leader peptide is MKKYLFSAA. This can direct secretion into
culture medium, and is extremely short and active. The use of this leader peptide is not
restricted to the expression of Neisserial proteins - it may be used to direct the expression of
any protein (particularly bacterial proteins).

Leader-peptide deletion 

In a third approach to heterologous expression, the native leader peptide of a protein of the 
invention is deleted. In addition, it is preferred that no fusion partner is used. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's leader peptide is deleted and, optionally, (b) no fusion
partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's
leader peptide. The resulting nucleic acid may be inserted into an expression vector, or may
already be part of an expression vector. The first amino acid of the expressed protein will be
that of the mature native protein.

This method can increase the levels of expression. For protein 919, for example, expression
levels in E.coli are much higher when the leader peptide is deleted. Increased expression
may be due to altered localisation in the absence of the leader peptide.

The method is preferably used for the expression of 919, ORF46, 961, 050-1, 760 and 287.

Domain-based expression

In a fourth approach to heterologous expression, the protein is expressed as domains. This 
may be used in association with fusion systems (e.g. GST or His-tag fusions). 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) at least one domain in the protein is deleted and, optionally, (b) no
fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove at least one domain from within the
protein. The resulting nucleic acid may be inserted into an expression vector, or may already
be part of an expression vector. Where no fusion partners are used, the first amino acid of the
expressed protein will be that of a domain of the protein.

A protein is typically divided into notional domains by aligning it with known sequences in
databases and then determining regions of the protein which show different alignment
patterns from each other.

The method is preferably used for the expression of protein 287. This protein can be
notionally split into three domains, referred to as A, B & C (see Figure 5). Domain B aligns
strongly with IgA proteases, domain C aligns strongly with transferrin-binding proteins, and
domain A shows no strong alignment with database sequences. An alignment of
polymorphic forms of 287 is disclosed in WO00/66741.

Once a protein has been divided into domains, these can be (a) expressed singly, (b) deleted
from within the protein e.g. protein ABCD → ABD, ACD, BCD etc., or (c) rearranged e.g.
protein ABC → ACB, CAB etc. These three strategies can be combined with fusion partners
if desired.
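The deletion and rearrangement strategies can be enumerated mechanically once the domains are named. The sketch below is illustrative only: the function names are hypothetical, domains are treated as single-letter labels, and 'deletion' is taken to mean any order-preserving subset of two or more domains (single domains being covered by strategy (a)).

```python
from itertools import combinations, permutations


def deletion_variants(domains: str) -> list[str]:
    """Order-preserving variants with at least one domain deleted and at
    least two domains retained, e.g. ABCD -> ABD, ACD, BCD, AB, ..."""
    variants = []
    for size in range(len(domains) - 1, 1, -1):
        variants.extend("".join(c) for c in combinations(domains, size))
    return variants


def rearranged_variants(domains: str) -> list[str]:
    """All domain orders other than the native one, e.g. ABC -> ACB, CAB, ..."""
    return ["".join(p) for p in permutations(domains) if "".join(p) != domains]


if __name__ == "__main__":
    print(deletion_variants("ABCD"))
    print(rearranged_variants("ABC"))
```

For a three-domain protein such as 287 this gives three two-domain deletion variants (AB, AC, BC) and five rearrangements; which of these express well is an empirical matter.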

ORF46 has also been notionally split into two domains - a first domain (amino acids 1-433)
which is well-conserved between species and serogroups, and a second domain (amino acids
433-608) which is not well-conserved. The second domain is preferably deleted. An
alignment of polymorphic forms of ORF46 is disclosed in WO00/66741.

Protein 564 has also been split into domains (Figure 8), as have protein 961 (Figure 12) and
protein 502 (amino acids 28-167 of the MC58 protein).

Hybrid proteins 

In a fifth approach to heterologous expression, two or more (e.g. 3, 4, 5, 6 or more) proteins
of the invention are expressed as a single hybrid protein. It is preferred that no
non-Neisserial fusion partner (e.g. GST or poly-His) is used.

This offers two advantages. Firstly, a protein that may be unstable or poorly expressed on its 
own can be assisted by adding a suitable hybrid partner that overcomes the problem. 
Secondly, commercial manufacture is simplified - only one expression and purification need 
be employed in order to produce two separately-useful proteins. 

Thus the invention provides a method for the simultaneous heterologous expression of two
or more proteins of the invention, in which said two or more proteins of the invention are
fused (i.e. they are translated as a single polypeptide chain).

The method will typically involve the steps of: obtaining a first nucleic acid encoding a first
protein of the invention; obtaining a second nucleic acid encoding a second protein of the
invention; ligating the first and second nucleic acids. The resulting nucleic acid may be
inserted into an expression vector, or may already be part of an expression vector.

Preferably, the constituent proteins in a hybrid protein according to the invention will be 
from the same strain. 

The fused proteins in the hybrid may be joined directly, or may be joined via a linker peptide
e.g. via a poly-glycine linker (i.e. Gn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more) or via a short
peptide sequence which facilitates cloning. It is evidently preferred not to join a ΔG protein
to the C-terminus of a poly-glycine linker.

The fused proteins may lack native leader peptides or may include the leader peptide 
sequence of the N-terminal fusion partner. 

The method is well suited to the expression of proteins orf1, orf4, orf25, orf40, Orf46/46.1,
orf83, 233, 287, 292L, 564, 687, 741, 907, 919, 953, 961 and 983.

The 42 hybrids indicated by 'X' in the following table of form NH2-A-B-COOH are
preferred:





           ORF46.1   287   741   919   953   961   983
ORF46.1       -       X     X     X     X     X     X
287           X       -     X     X     X     X     X
741           X       X     -     X     X     X     X
919           X       X     X     -     X     X     X
953           X       X     X     X     -     X     X
961           X       X     X     X     X     -     X
983           X       X     X     X     X     X     -





Preferred proteins to be expressed as hybrids are thus ORF46.1, 287, 741, 919, 953, 961 and
983. These may be used in their essentially full-length form, or poly-glycine deletion (ΔG)
forms may be used (e.g. ΔG-287, ΔGTbp2, ΔG741, ΔG983 etc.), or truncated forms may be
used (e.g. Δ1-287, Δ2-287 etc.), or domain-deleted versions may be used (e.g. 287B, 287C,
287BC, ORF46(1-433), ORF46(433-608), ORF46, 961c etc.).
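The 42 preferred two-component hybrids are simply every ordered pair of two different proteins drawn from the seven listed (7 x 6 = 42). A minimal sketch, with the hypothetical names `PROTEINS` and `preferred_hybrids`, makes the count explicit:

```python
from itertools import permutations

# The seven proteins named in the text as preferred hybrid components.
PROTEINS = ["ORF46.1", "287", "741", "919", "953", "961", "983"]


def preferred_hybrids(proteins=PROTEINS) -> list[str]:
    """Every ordered pair A, B of distinct proteins, written NH2-A-B-COOH."""
    return [f"NH2-{a}-{b}-COOH" for a, b in permutations(proteins, 2)]


if __name__ == "__main__":
    hybrids = preferred_hybrids()
    print(len(hybrids))
    print(hybrids[:3])
```

Order matters here: NH2-919-287-COOH and NH2-287-919-COOH are counted as different hybrids, which is why the table has an entry on both sides of the diagonal.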

Particularly preferred are: (a) a hybrid protein comprising 919 and 287; (b) a hybrid protein
comprising 953 and 287; (c) a hybrid protein comprising 287 and ORF46.1; (d) a hybrid
protein comprising ORF1 and ORF46.1; (e) a hybrid protein comprising 919 and ORF46.1;
(f) a hybrid protein comprising ORF46.1 and 919; (g) a hybrid protein comprising ORF46.1,
287 and 919; (h) a hybrid protein comprising 919 and 519; and (i) a hybrid protein
comprising ORF97 and 225. Further embodiments are shown in Figure 14.

Where 287 is used, it is preferably at the C-terminal end of a hybrid; if it is to be used at the
N-terminus, it is preferred that a ΔG form of 287 is used (e.g. as the N-terminus of a
hybrid with ORF46.1, 919, 953 or 961).

Where 287 is used, this is preferably from strain 2996 or from strain 394/98. 

Where 961 is used, this is preferably at the N-terminus. Domain forms of 961 may be used.

Alignments of polymorphic forms of ORF46, 287, 919 and 953 are disclosed in 
WO00/66741. Any of these polymorphs can be used according to the present invention. 

Temperature

In a sixth approach to heterologous expression, proteins of the invention are expressed at a 
low temperature. 

Expressed Neisserial proteins (e.g. 919) may be toxic to E.coli, which can be avoided by
expressing the toxic protein at a temperature at which its toxic activity is not manifested.

Thus the present invention provides a method for the heterologous expression of a protein of 
the invention, in which expression of a protein of the invention is carried out at a 
temperature at which a toxic activity of the protein is not manifested. 

A preferred temperature is around 30°C. This is particularly suited to the expression of 919.

Mutations

As discussed above, expressed Neisserial proteins may be toxic to E.coli. This toxicity can
be avoided by mutating the protein to reduce or eliminate the toxic activity. In particular,
mutations to reduce or eliminate toxic enzymatic activity can be used, preferably using
site-directed mutagenesis.

In a seventh approach to heterologous expression, therefore, an expressed protein is mutated
to reduce or eliminate toxic activity.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which the protein is mutated to reduce or eliminate toxic activity.

The method is preferably used for the expression of protein 907, 919 or 922. A preferred
mutation in 907 is at Glu-117 (e.g. Glu→Gly); preferred mutations in 919 are at Glu-255
(e.g. Glu→Gly) and/or Glu-323 (e.g. Glu→Gly); preferred mutations in 922 are at Glu-164
(e.g. Glu→Gly), Ser-213 (e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly).
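A point substitution of this kind is easy to check in silico before designing mutagenesis primers. The sketch below is an illustration only: `substitute` is a hypothetical helper, and the test segment is residues 251-260 of protein 919 as numbered in the sequence listing of Example 1, in which residue 255 is glutamate.

```python
# Hypothetical helper: apply a single amino-acid substitution after checking
# that the residue being replaced is the one the mutation name expects.
def substitute(seq: str, position: int, expected: str, replacement: str,
               first_residue: int = 1) -> str:
    """Replace the residue at `position` (numbered so that seq[0] is
    `first_residue`), refusing to proceed if the wild-type residue differs."""
    index = position - first_residue
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


if __name__ == "__main__":
    segment_251_260 = "EDPVELFFMH"  # residues 251-260 of 919 (Example 1)
    # Glu-255 -> Gly
    print(substitute(segment_251_260, 255, "E", "G", first_residue=251))
```

Checking the expected wild-type residue guards against numbering mismatches, for instance between precursor numbering (used here) and numbering of the mature protein.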

Alternative vectors 

In an eighth approach to heterologous expression, an alternative vector is used to express the
protein. This may be to improve expression yields, for instance, or to utilise plasmids that are
already approved for GMP use.

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which an alternative vector is used. The alternative vector is preferably 
pSM214, with no fusion partners. Leader peptides may or may not be included. 

This approach is particularly useful for protein 953. Expression and localisation of 953 with
its native leader peptide expressed from pSM214 is much better than from the pET vector.

pSM214 may also be used with: ΔG287, Δ2-287, Δ3-287, Δ4-287, Orf46.1, 961L, 961,
961(MC58), 961c, 961c-L, 919, 953 and ΔG287-Orf46.1.

Another suitable vector is pET-24b (Novagen; uses kanamycin resistance), again using no
fusion partners. pET-24b is preferred for use with: ΔG287K, Δ2-287K, Δ3-287K, Δ4-287K,
Orf46.1-K, Orf46A-K, 961-K (MC58), 961a-K, 961b-K, 961c-K, 961c-L-K, 961d-K,
ΔG287-919-K, ΔG287-Orf46.1-K and ΔG287-961-K.

Multimeric form 

In a ninth approach to heterologous expression, a protein is expressed or purified such that it
adopts a particular multimeric form.

This approach is particularly suited to protein 953. Purification of one particular multimeric
form of 953 (the monomeric form) gives a protein with greater bactericidal activity than
other forms (the dimeric form).

Proteins 287 and 919 may be purified in dimeric forms.

Protein 961 may be purified in a 180kDa oligomeric form (e.g. a tetramer). 

Lipidation 

In a tenth approach to heterologous expression, a protein is expressed as a lipidated protein. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which the protein is expressed as a lipidated protein.

This is particularly useful for the expression of 919, 287, ORF4, 406, 576-1, and ORF25. 
Polymorphic forms of 919, 287 and ORF4 are disclosed in WO00/66741. 

The method will typically involve the use of an appropriate leader peptide without using an 
N-terminal fusion partner. 

C-terminal deletions

In an eleventh approach to heterologous expression, the C-terminus of a protein of the
invention is mutated. In addition, it is preferred that no fusion partner is used.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's C-terminus region is mutated and, optionally, (b) no
fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to mutate nucleotides that encode the protein's
C-terminus portion. The resulting nucleic acid may be inserted into an expression vector, or
may already be part of an expression vector. The first amino acid of the expressed protein
will be that of the mature native protein.

The mutation may be a substitution, insertion or, preferably, a deletion. 

This method can increase the levels of expression, particularly for proteins 730, ORF29 and
ORF46. For protein 730, a C-terminus region of around 65 to around 214 amino acids may
be deleted; for ORF46, the C-terminus region of around 175 amino acids may be deleted; for
ORF29, the C-terminus may be deleted to leave around 230-370 N-terminal amino acids.

Leader peptide mutation

In a twelfth approach to heterologous expression, the leader peptide of the protein is
mutated. This is particularly useful for the expression of protein 919.

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which the protein's leader peptide is mutated. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; and manipulating said nucleic acid to mutate nucleotides within the leader
peptide. The resulting nucleic acid may be inserted into an expression vector, or may already
be part of an expression vector.

Poly-glycine deletion 

In a thirteenth approach to heterologous expression, poly-glycine stretches in wild-type
sequences are mutated. This enhances protein expression.

The poly-glycine stretch has the sequence (Gly)n, where n>4 (e.g. 5, 6, 7, 8, 9 or more). This
stretch is mutated to disrupt or remove the (Gly)n. This may be by deletion (e.g. CGGGGS →
CGGGS, CGGS, CGS or CS), by substitution (e.g. CGGGGS → CGXGGS, CGXXGS,
CGXGXS etc.), and/or by insertion (e.g. CGGGGS → CGGXGGS, CGXGGGS, etc.).
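The deletion strategy can be sketched with a regular expression. This is an illustration under stated assumptions: the function names are hypothetical, and the default threshold of four glycines is chosen only so that the worked example CGGGGS from the text is recognised; a stricter threshold can be passed via `min_len`.

```python
import re


def find_polyglycine(seq: str, min_len: int = 4) -> list[tuple[int, int]]:
    """Return (start, end) spans of runs of at least `min_len` glycines."""
    pattern = re.compile(f"G{{{min_len},}}")
    return [m.span() for m in pattern.finditer(seq)]


def deletion_mutants(seq: str, min_len: int = 4) -> list[str]:
    """Shorten the first poly-glycine run one residue at a time, down to
    complete removal of the run."""
    match = re.search(f"G{{{min_len},}}", seq)
    if match is None:
        return []
    start, end = match.span()
    run_length = end - start
    return [
        seq[:start] + "G" * remaining + seq[end:]
        for remaining in range(run_length - 1, -1, -1)
    ]


if __name__ == "__main__":
    print(find_polyglycine("CGGGGS"))
    print(deletion_mutants("CGGGGS"))
```

Applied to CGGGGS, the second function reproduces the deletion series given in the text (CGGGS, CGGS, CGS, CS). Substitution and insertion variants could be generated in the same way by replacing or interrupting the matched run.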

This approach is not restricted to Neisserial proteins - it may be used for any protein
(particularly bacterial proteins) to enhance heterologous expression. For Neisserial proteins,
however, it is particularly suitable for expressing 287, 741, 983 and Tbp2. An alignment of
polymorphic forms of 287 is disclosed in WO00/66741.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) a poly-glycine stretch within the protein is mutated.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; and manipulating said nucleic acid to mutate nucleotides that encode a
poly-glycine stretch within the protein sequence. The resulting nucleic acid may be inserted into
an expression vector, or may already be part of an expression vector.

Conversely, the opposite approach (i.e. introduction of poly-glycine stretches) can be used to
suppress or diminish expression of a given heterologous protein.

Heterologous host 

Whilst expression of the proteins of the invention may take place in the native host (i.e. the
organism in which the protein is expressed in nature), the present invention utilises a
heterologous host. The heterologous host may be prokaryotic or eukaryotic. It is preferably
E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae, Salmonella typhi,
Salmonella typhimurium, Neisseria meningitidis, Neisseria gonorrhoeae, Neisseria
lactamica, Neisseria cinerea, Mycobacteria (e.g. M.tuberculosis), yeast etc.

Vectors etc.

As well as the methods described above, the invention provides (a) nucleic acid and vectors 
useful in these methods (b) host cells containing said vectors (c) proteins expressed or
expressible by the methods (d) compositions comprising these proteins, which may be
suitable as vaccines, for instance, or as diagnostic reagents, or as immunogenic compositions
(e) these compositions for use as medicaments (e.g. as vaccines) or as diagnostic reagents (f)
the use of these compositions in the manufacture of (1) a medicament for treating or 
preventing infection due to Neisserial bacteria (2) a diagnostic reagent for detecting the 
presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria, and/or (3) a 
reagent which can raise antibodies against Neisserial bacteria and (g) a method of treating a 



WO 01/64922 PCT/IB01/00452 


patient, comprising administering to the patient a therapeutically effective amount of these 
compositions. 

Sequences 

The invention also provides a protein or a nucleic acid having any of the sequences set out in
the following examples. It also provides proteins and nucleic acid having sequence identity
to these. As described above, the degree of 'sequence identity' is preferably greater than
50% (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more).

Furthermore, the invention provides nucleic acid which can hybridise to the nucleic acid
disclosed in the examples, preferably under "high stringency" conditions (e.g. 65°C in a
0.1xSSC, 0.5% SDS solution).

The invention also provides nucleic acid encoding proteins according to the invention. 

It should also be appreciated that the invention provides nucleic acid comprising sequences 
complementary to those described above (eg. for antisense or probing purposes). 

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by
chemical synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can
take various forms (e.g. single stranded, double stranded, vectors, probes etc.).

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such 
as those containing modified backbones, and also peptide nucleic acids (PNA) etc. 

BRIEF DESCRIPTION OF DRAWINGS 

Figures 1 and 2 show constructs used to express proteins using heterologous leader peptides.

Figure 3 shows expression data for ORF1, and Figure 4 shows similar data for protein 961. 
Figure 5 shows domains of protein 287, and Figures 6 & 7 show deletions within domain A. 
Figure 8 shows domains of protein 564. 

Figure 9 shows the PhoC reporter gene driven by the 919 leader peptide, and Figure 10
shows the results obtained using mutants of the leader peptide.

Figure 11 shows insertion mutants of protein 730 (A: 730-C1; B: 730-C2). 

Figure 12 shows domains of protein 961. 

Figure 13 shows SDS-PAGE of ΔG proteins. Dots show the main recombinant product.
Figure 14 shows 26 hybrid proteins according to the invention.

MODES FOR CARRYING OUT THE INVENTION

Example 1 - 919 and its leader peptide

Protein 919 from N.meningitidis (serogroup B, strain 2996) has the following sequence:

1 MKKYLFRAAL YGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA
51 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV
101 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR
151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT
201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA
251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL
301 KLGQTSMQGI KAYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG
351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG
401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P*

The leader peptide comprises residues 1-20 (MKKYLFRAAL YGIAAAILAA).

The sequences of 919 from other strains can be found in Figures 7 and 18 of WO00/66741.

Example 2 of WO99/57280 discloses the expression of protein 919 as a His-fusion in E.coli.
The protein is a good surface-exposed immunogen.

Three alternative expression strategies were used for 919:

1) 919 without its leader peptide (and without the mature N-terminal cysteine) and
without any fusion partner ('919 untagged'):
1 QSKSIQTFP QPDTSVINGP DRPVGIPDPA GTTVGGGGAV YTVVPHLSLP
50 HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV CAQAFQTPVH SFQAKQFFER
100 YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR RTAQARFPIY GIPDDFISVP
150 LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT HTADLSRFPI TARTTAIKGR
200 FEGSRFLPYH TRNQINGGAL DGKAPILGYA EDPVELFFMH IQGSGRLKTP
250 SGKYIRIGYA DKNEHPYVSI GRYMADKGYL KLGQTSMQGI KAYMRQNPQR
300 LAEVLGQNPS YIFFRELAGS SNDGPVGALG TPLMGEYAGA VDRHYITLGA
350 PLFVATAHPV TRKALNRLIM AQDTGSAIKG AVRVDYFWGY GDEAGELAGK
400 QKTTGYVWQL LPNGMKPEYR P*

The leader peptide and cysteine were omitted by designing the 5'-end amplification 
primer downstream from the predicted leader sequence. 

2) 919 with its own leader peptide but without any fusion partner ('919L'); and

3) 919 with the leader peptide (mktffktlsaaalalilaa) from ORF4 ('919LOrf4'):

1 MKTFFKTLS AAALALILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA
50 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV
100 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR
150 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT
200 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA
250 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL
300 KLGQTSMQGI KAYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG
350 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG
400 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P*

To make this construct, the entire sequence encoding the ORF4 leader peptide was
included in the 5'-primer as a tail (primer 919Lorf4 For). An NheI restriction site was
generated by a double nucleotide change in the sequence coding for the ORF4 leader
(no amino acid changes), to allow different genes to be fused to the ORF4 leader
peptide sequence. A stop codon was included in all the 3'-end primer sequences.

All three forms of the protein were expressed and could be purified. 
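The relationship between the three constructs can be expressed as simple string operations on the precursor sequence. The sketch below is illustrative only: the function names are hypothetical, only the first 40 residues of 919 are used, and the native leader boundary (residues 1-20, followed by Cys-21) is inferred from the untagged sequence listed above rather than stated as such.

```python
# Illustrative only: N-terminal 40 residues of the 919 precursor (Example 1).
PRECURSOR_N_TERM = "MKKYLFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGP"
# Assumed native leader: residues 1-20, inferred from the untagged construct.
NATIVE_LEADER = "MKKYLFRAALYGIAAAILAA"
# ORF4 leader peptide as given in the text.
ORF4_LEADER = "MKTFFKTLSAAALALILAA"


def untagged(precursor: str, leader: str) -> str:
    """Remove the leader peptide and the cysteine that follows it."""
    if not precursor.startswith(leader + "C"):
        raise ValueError("precursor does not start with leader + Cys")
    return precursor[len(leader) + 1:]


def swap_leader(precursor: str, old_leader: str, new_leader: str) -> str:
    """Replace the native leader with another leader, keeping the cysteine."""
    if not precursor.startswith(old_leader):
        raise ValueError("precursor does not start with the given leader")
    return new_leader + precursor[len(old_leader):]


if __name__ == "__main__":
    print("919L      ", PRECURSOR_N_TERM)
    print("untagged  ", untagged(PRECURSOR_N_TERM, NATIVE_LEADER))
    print("919LOrf4  ", swap_leader(PRECURSOR_N_TERM, NATIVE_LEADER, ORF4_LEADER))
```

The outputs match the N-termini of the listed sequences: the untagged form begins QSKSIQTFP, and the ORF4-leader form begins MKTFFKTLS and retains the cysteine that follows the leader. Whether the vector supplies an initiating methionine for the untagged form is not stated here, so the sketch reproduces the sequence as listed.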

The '919L' and '919LOrf4' expression products were both lipidated, as shown by the
incorporation of [3H]-palmitate label. 919 untagged did not incorporate the 3H label and was
located intracellularly.

919LOrf4 could be purified more easily than 919L. It was purified and used to immunise 
mice. The resulting sera gave excellent results in FACS and ELISA tests, and also in the 
bactericidal assay. The lipoprotein was shown to be localised in the outer membrane. 

919 untagged gave excellent ELISA titres and high serum bactericidal activity. FACS confirmed 
its cell surface location. 

Example 2 - 919 and expression temperature 

Growth of E.coli expressing the 919LOrf4 protein at 37°C resulted in lysis of the bacteria. In
order to overcome this problem, the recombinant bacteria were grown at 30°C. Lysis was
prevented without preventing expression.

Example 3 - mutation of 907, 919 and 922 

It was hypothesised that proteins 907, 919 and 922 are murein hydrolases, and more 
particularly lytic transglycosylases. Murein hydrolases are located on the outer membrane 
and participate in the degradation of peptidoglycan. 

The purified proteins 919 untagged, 919Lorf4, 919-His (i.e. with a C-terminus His-tag) and
922-His were thus tested for murein hydrolase activity [Ursinus & Holtje (1994) J.Bact.
176:338-343]. Two different assays were used, one determining the degradation of insoluble
murein sacculus into soluble muropeptides and the other measuring breakdown of
poly(MurNAc-GlcNAc)n>30 glycan strands.
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The first assay uses murein sacculi radiolabeled with meso-2,6-diamino-3,4,5-[ 3 H]pimelic 
acid as substrate. Enzyme (3-10 |4g total) was incubated for 45 minutes at 37°C in a total 
volume of IOOjjI comprising lOmM Tris-maleate (pH 5.5), lOmM MgCl 2 , 0.2% v/v Triton 
X-100 and [ 3 H]A 2 pm labelled murein sacculi (about lOOOOcpm). The assay mixture was 
5 placed on ice for 15 minutes with 100 \xl of 1% w/v N-acetyl-N,N^-trimethylammonium for 
15 minutes and precipitated material pelleted by centrifugation at 10000# for 15 minutes. 
The radioactivity in the supernatant was measured by liquid scintillation counting. Kcoli 
soluble lytic transglycosylase Slt70 was used as a positive control for the assay; the negative 
control comprised the above assay solution without enzyme. 

All proteins except 919-His gave positive results in the first assay.

The second assay monitors the hydrolysis of poly(MurNAc-GlcNAc) glycan strands. Purified
strands, poly(MurNAc-GlcNAc)n>30 labelled with N-acetyl-D-1-[3H]glucosamine, were
incubated with 3 µg of 919L in 10 mM Tris-maleate (pH 5.5), 10 mM MgCl2 and 0.2% v/v
Triton X-100 for 30 min at 37°C. The reaction was stopped by boiling for 5 minutes and the
pH of the sample adjusted to about 3.5 by addition of 10 µl of 20% v/v phosphoric acid.
Substrate and product were separated by reversed phase HPLC on a Nucleosil 300 C18
column as described by Harz et al. [Anal. Biochem. (1990) 190:120-128]. The E.coli lytic
transglycosylase MltA was used as a positive control in the assay. The negative control was
performed in the absence of enzyme.

By this assay, the ability of 919LOrf4 to hydrolyse isolated glycan strands was demonstrated
when anhydrodisaccharide subunits were separated from the oligosaccharide by HPLC.

Protein 919LOrf4 was chosen for kinetic analyses. The activity of 919LOrf4 was enhanced
3.7-fold by the addition of 0.2% v/v Triton X-100 in the assay buffer. The presence of Triton
X-100 had no effect on the activity of 919 untagged. The effect of pH on enzyme activity was
determined in Tris-Maleate buffer over a range of 5.0 to 8.0. The optimal pH for the reaction
was determined to be 5.5. Over the temperature range 18°C to 42°C, maximum activity was
observed at 37°C. The effect of various ions on murein hydrolase activity was determined by
performing the reaction in the presence of a variety of ions at a final concentration of 10 mM.
Maximum activity was found with Mg2+, which stimulated activity 2.1-fold. Mn2+ and Ca2+
also stimulated enzyme activity to a similar extent, while the addition of Ni2+ and EDTA had no
significant effect. In contrast, both Fe2+ and Zn2+ significantly inhibited enzyme activity.

The structures of the reaction products resulting from the digestion of unlabelled E.coli
murein sacculus were analysed by reversed-phase HPLC as described by Glauner [Anal.
Biochem. (1988) 172:451-464]. Murein sacculi digested with the muramidase Cellosyl were
used to calibrate and standardise the Hypersil ODS column. The major reaction products
were 1,6-anhydrodisaccharide tetra- and tripeptides, demonstrating the formation of a
1,6-anhydromuramic acid intramolecular bond.

These results demonstrate experimentally that 919 is a murein hydrolase and in particular a 
member of the lytic transglycosylase family of enzymes. Furthermore the ability of 922-His 
to hydrolyse murein sacculi suggests this protein is also a lytic transglycosylase. 

This activity may help to explain the toxic effects of 919 when expressed in E.coli.

In order to eliminate the enzymatic activity, rational mutagenesis was used. 907, 919 and
922 show fairly low homology to three membrane-bound lipidated murein lytic
transglycosylases from E.coli:

919 (441aa) is 27.3% identical over 440aa overlap to E.coli MLTA (P46885);

922 (369aa) is 38.7% identical over 310aa overlap to E.coli MLTB (P41052); and

907-2 (207aa) is 26.8% identical over 149aa overlap to E.coli MLTC (P52066).

907-2 also shares homology with E.coli MLTD (P23931) and Slt70 (P03810), a soluble lytic
transglycosylase that is located in the periplasmic space. No significant sequence homology
can be detected among 919, 922 and 907-2, and the same is true among the corresponding
MLTA, MLTB and MLTC proteins.
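Identity figures of the kind quoted above are computed over the aligned overlap of two sequences. A minimal Python sketch follows, assuming the alignment is supplied as two equal-length gapped strings; the function name `percent_identity` is hypothetical, and the short fragment is taken from the 922 and MLTB sequences reported in this example, with the column pairing assumed for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns that contribute to the overlap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Illustrative fragment only (assumed pairing of residues).
    frag_922 = "KGSYAGAMGMPQFMPSS"
    frag_mltb = "KGSFAGAMGYGQFMPSS"
    print(f"{percent_identity(frag_922, frag_mltb):.1f}% identical "
          f"over {len(frag_922)} aa")
```

A short conserved fragment gives a much higher value than the whole-protein figures, which is why the length of the overlap is always reported alongside the percentage.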

Crystal structures are available for Slt70 [1QTEA; 1QTEB; Thunnissen et al. (1995)
Biochemistry 34:12729-12737] and for Slt35 [1LTM; 1QUS; 1QUT; van Asselt et al. (1999)
Structure Fold Des 7:1167-80] which is a soluble form of the 40kDa MLTB.

The catalytic residue (a glutamic acid) has been identified for both Slt70 and MLTB. 

In the case of Slt70, mutagenesis studies have demonstrated that even a conservative
substitution of the catalytic Glu505 with a glutamine (Gln) causes the complete loss of
enzymatic activity. Although Slt35 has no obvious sequence similarity to Slt70, their
catalytic domains show a surprising similarity. The corresponding catalytic residue in
MLTB is Glu162.

Another residue which is believed to play an important role in the correct folding of the
enzymatic cleft is a well-conserved glycine (Gly) downstream of the glutamic acid.
Recently, Terrak et al. [Mol. Microbiol. (1999) 34:350-64] have suggested the presence of
another important residue which is an aromatic amino acid located around 70-75 residues
downstream of the catalytic glutamic acid.

Sequence alignments of Slt70 with 907-2 and of MLTB with 922 were performed in order to
identify the corresponding catalytic residues in the MenB antigens.

The two alignments in the region of the catalytic domain are reported below:

907-2/Slt70:

90 100 110 ▼120 130 140
907-2.pep ERRRLLWIQYESSRAG- -LDTQIVLGLIEVESAFRQYAISGVGARGLMQVMPFWKNYIG
Mil:: :| : : « llh : | |Tl Tl I t * I I ::
slty_ecoli ERFPLAYNDLFKRYTSGKEIPQSYAMAIARQESAWNPKVKSPVGASGLMQIMPGTATHTV
480 490 500 ▲ 510 520 530
GLU505

922/MLTB:

150 160 ▼ 170 180 190 200
922.pep VAQKYGVPAELIVAVIGIETNYGKNTGSFRVADALATLGFDYPRRAGFFQKELVELLKLA
: | UN |:|h:||:|| :h h h I I I I I I : I : I I I I I :|: )| :| :|
mltb_ecoli AWQVYGVPPEIIVGIIGVETRWGRWQKTRILDAIiATLSFNYPRRAEYFSGELETFLLMA
150 160 ▲ 170 180 190 200
GLU162

210 220 230 240 250 260
922.pep KEEGGDVFAFKGSYAGAMGMPQFMPSSTRKWATO
:=l I = 'llhllllh I I I II lT« ' ' I I h : I I I I "I I h 'llllhl
mltb_ecoli RDEQDDPLNLKGSFAGAMGYGQFMPSSYKQYAVDFSGDGHINLWDPV-DAIGSVANYFKA
210 220 230 240 250 260

From these alignments, it appears that the corresponding catalytic glutamate in 907-2 is
Glu117, whereas in 922 it is Glu164. Both antigens also share downstream glycines that could
have a structural role in the folding of the enzymatic cleft (in bold), and 922 has a conserved
aromatic residue around 70aa downstream (in bold).

In the case of protein 919, no 3D structure is available for its E.coli homologue MLTA, and
nothing is known about a possible catalytic residue. Nevertheless, three amino acids in 919
are predicted as catalytic residues by alignment with MLTA:

919/MLTA

240 250 ▼ 260 □ □ 270 □ 280 290
919.pep ALDGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRI--GYADKNEHPYVSIGRVMADK
||: I ||:|::: :: |:| : :|: : : :|| II I I llh : h
mlta_ecoli.p ALSDKY-ILAYSNSLMDNFIMDVQGSGYIDFGDGSPLNFFSYAGKNGHAYRSIGKVLIDR
170 180 190 200 210

300 310 320 ▼ 330O CD 340 0350 0
919.pep GYLKLGOTSMQGIKSYMRQNPQ^
I 0 ' llhh » * : : :: hi I I I I :» I h s : || || ::||:|
mlta_ecoli.p GEVKKEDMSMQAIRHWGETHSEAEVRELI^QNPSFVFFKPQSFA PVKGASAVPLVG
220 230 240 250 260 270

360 ▼ o 380 390 400 00410
919.pep EYAGAVDRHYITLGAPLFVATAHPVTRKALN RL IMAQDTGS AIKGAVRVDYFWGY
n : • I II I h |:s : : »| ||::| |:|:|||| : | : |
mlta_ecoli.p RASVASDRSIIPPGTTI*LAEVPIJiDNNGKFW
280 290 300 310 320 330

420 o
919.pep GDEAGELAGKQKTTGYVWQLLP
I III: II : IIII
mlta_ecoli.p GPEAGHRAGWYNHYGRVWVLKT
340 350

The three possible catalytic residues are shown by the symbol ▼:

1) Glu255 (Asp in MLTA), followed by three conserved glycines (Gly263, Gly265 and
Gly272) and three conserved aromatic residues located approximately 75-77 residues
downstream. These downstream residues are shown by □.

2) Glu323 (conserved in MLTA), followed by 2 conserved glycines (Gly347 and Gly355)
and two conserved aromatic residues located 84-85 residues downstream (Tyr406 or
Phe407). These downstream residues are shown by 0.

3) Asp362 (instead of the expected Glu), followed by one glycine (Gly369) and a
conserved aromatic residue (Trp428). These downstream residues are shown by o.

Alignments of polymorphic forms of 919 are disclosed in WO00/66741. 

Based on the prediction of catalytic residues, three mutants of 919 and one mutant of
907, each containing a single amino acid substitution, have been generated. The glutamic
acids at positions 255 and 323 and the aspartic acid at position 362 of the 919 protein, and
the glutamic acid at position 117 of the 907 protein, were replaced with glycine residues
using PCR-based SDM. To do this, internal primers containing a codon change from Glu or
Asp to Gly were designed:

Primers         Sequences                         Codon change

919-E255 for    CGAAGACCCCGTCGgtCTTTTTTTTATG      GAA -> Ggt
919-E255 rev    GTGCATAAAAAAAAGacrGArnnfifrTrT

919-E323 for    AACGCCTCGCCGgtGTTTTGGGTCA         GAA -> Ggt
919-E323 rev    TTTGACCCAAAACacCGGCGAGGTfi

919-D362 for    TGCCGGCGCAGTCGgtCGGCACTACA        GAC -> Ggt
919-D362 rev    TAATGTAGTGCCGacCGACTCiCCirrn

907-E117 for    TGATTGAGGTGGgtAGCGCGTTCCG         GAA -> Ggt
907-E117 rev    GGCGGAACGCGCTacCCACCTCAAT


Underlined nucleotides code for glycine; the mutated nucleotides are in lower case. 
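The codon changes listed for the mutagenic primers can be checked mechanically. The sketch below is illustrative only: it assumes that the mutated codon in each forward primer consists of the upper-case base immediately before the lower-case bases plus the lower-case bases themselves, and the helper names (`mutant_codon`, `describe_change`) and the partial codon table are hypothetical.

```python
# Partial standard genetic code: only the codons relevant to Glu/Asp -> Gly.
CODONS = {
    "GAA": "Glu", "GAG": "Glu",
    "GAC": "Asp", "GAT": "Asp",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
}

# Forward primers as printed in the table (mutated bases in lower case).
FORWARD_PRIMERS = {
    "919-E255": ("CGAAGACCCCGTCGgtCTTTTTTTTATG", "GAA"),
    "919-E323": ("AACGCCTCGCCGgtGTTTTGGGTCA", "GAA"),
    "919-D362": ("TGCCGGCGCAGTCGgtCGGCACTACA", "GAC"),
    "907-E117": ("TGATTGAGGTGGgtAGCGCGTTCCG", "GAA"),
}


def mutant_codon(primer):
    """Return the mutated codon: one preceding base plus the lower-case bases."""
    lower = [i for i, base in enumerate(primer) if base.islower()]
    if len(lower) != 2 or lower[0] == 0:
        raise ValueError("expected exactly two adjacent lower-case bases")
    return primer[lower[0] - 1:lower[-1] + 1].upper()


def describe_change(wild_type, mutant):
    """Return (wild-type residue, mutant residue, number of bases changed)."""
    changed = sum(1 for a, b in zip(wild_type, mutant) if a != b)
    return CODONS[wild_type], CODONS[mutant], changed


if __name__ == "__main__":
    for name, (primer, wt_codon) in FORWARD_PRIMERS.items():
        new_codon = mutant_codon(primer)
        wt_aa, mut_aa, n = describe_change(wt_codon, new_codon)
        print(f"{name}: {wt_codon} ({wt_aa}) -> {new_codon} ({mut_aa}), "
              f"{n} base(s) changed")
```

Each substitution changes two bases of the codon, which is consistent with the two lower-case nucleotides shown in every primer.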

To generate the 919-E255, 919-E323 and 919-D362 mutants, PCR was performed using
20ng of the pET 919-LOrf4 DNA as template, and the following primer pairs:



1) Orf4L for / 919-E255 rev

2) 919-E255 for / 919L rev

3) Orf4L for / 919-E323 rev

4) 919-E323 for / 919L rev

5) Orf4L for / 919-D362 rev

6) 919-D362 for / 919L rev

The second round of PCR was performed using the product of PCR 1-2, 3-4 or 5-6 as 
template, and as forward and reverse primers the "Orf4L for" and "919L rev" respectively. 

For the mutant 907-E117, PCR was performed using 200ng of chromosomal DNA of
the 2996 strain as template and the following primer pairs:

7) 907L for / 907-E117 rev

8) 907-E117 for / 907L rev

The second round of PCR was performed using the products of PCR 7 and 8 as templates 
and the oligos "907L for" and "907L rev" as primers. 

The PCR fragments containing each mutation were processed following the standard 
procedure, digested with Ndel and Xhol restriction enzymes and cloned into pET-21b+ 
vector. The presence of each mutation was confirmed by sequence analysis. 

Mutation of Glu117 to Gly in 907 is carried out similarly, as is mutation of residues Glu164,
Ser213 and Asn348 in 922.

The E255G mutant of 919 shows a 50% reduction in activity; the E323G mutant shows a
70% reduction in activity; the D362G mutant shows no reduction in activity.

Example 4 - multimeric form 

287-GST, 919 untagged and 953-His were subjected to gel filtration for analysis of quaternary
structure or preparative purposes. The molecular weight of the native proteins was estimated
using either FPLC Superose 12 (H/R 10/30) or Superdex 75 gel filtration columns
(Pharmacia). The buffers used for chromatography for 287, 919 and 953 were 50 mM Tris-HCl
(pH 8.0), 20 mM Bicine (pH 8.5) and 50 mM Bicine (pH 8.0), respectively.

Additionally each buffer contained 150-200 mM NaCl and 10% v/v glycerol. Proteins were
dialysed against the appropriate buffer and applied in a volume of 200 µl. Gel filtration was
performed with a flow rate of 0.5 - 2.0 ml/min and the eluate monitored at 280nm. Fractions
were collected and analysed by SDS-PAGE. Blue dextran 2000 and the molecular weight
standards ribonuclease A, chymotrypsin A, ovalbumin and albumin (Pharmacia) were used to
calibrate the column. The molecular weight of the sample was estimated from a calibration
curve of Kav vs. log Mr of the standards. Before gel filtration, 287-GST was digested with
thrombin to cleave the GST moiety.
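The calibration step can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the usual definition Kav = (Ve - V0) / (Vt - V0), with the void volume V0 taken from the Blue dextran 2000 elution, and fits a straight line of Kav against log10(Mr) by ordinary least squares. All volumes and the nominal masses of the standards below are hypothetical placeholders, and the function names are illustrative.

```python
import math


def kav(ve, v0, vt):
    """Partition coefficient from elution (ve), void (v0) and total (vt) volumes."""
    return (ve - v0) / (vt - v0)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mr(sample_kav, slope, intercept):
    """Invert the calibration line to estimate the molecular weight."""
    return 10 ** ((sample_kav - intercept) / slope)


if __name__ == "__main__":
    V0, VT = 8.0, 24.0  # hypothetical void and total column volumes (ml)
    # Hypothetical (nominal Mr, elution volume in ml) pairs for the standards.
    standards = [(13700, 17.2), (25000, 15.9), (43000, 14.6), (67000, 13.6)]
    xs = [math.log10(mr) for mr, _ in standards]
    ys = [kav(ve, V0, VT) for _, ve in standards]
    slope, intercept = fit_line(xs, ys)
    sample_ve = 14.0  # hypothetical elution volume of the sample (ml)
    print(round(estimate_mr(kav(sample_ve, V0, VT), slope, intercept)))
```

An estimate close to twice the mass calculated from the sequence would then point to a dimer, and one close to the calculated mass to a monomer.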

The estimated molecular weights for 287, 919 and 953-His were 73 kDa, 47 kDa and 43 kDa
respectively. These results suggest 919 is monomeric while both 287 and 953 are principally
dimeric in their nature. In the case of 953-His, two peaks were observed during gel filtration.
The major peak (80%) represented a dimeric conformation of 953 while the minor peak
(20%) had the expected size of a monomer. The monomeric form of 953 was found to have
greater bactericidal activity than the dimer.

Example 5 - pSM214 and pET-24b vectors 

953 protein with its native leader peptide and no fusion partners was expressed from the pET
vector and also from pSM214 [Velati Bellini et al. (1991) J. Biotechnol. 18, 177-192].

The 953 sequence was cloned as a full-length gene into pSM214 using the E. coli MM294-1
strain as a host. To do this, the entire DNA sequence of the 953 gene (from ATG to the
STOP codon) was amplified by PCR using the following primers:

953L for/2   CCGGAATTCTTATGAAAAAAATCATCTTCGCCGC   EcoRI

953L rev/2   GCCCAAGCTTTTATTGTTTGGCTGCCTCGATT     HindIII

which contain EcoRI and HindIII restriction sites, respectively. The amplified fragment was
digested with EcoRI and HindIII and ligated with the pSM214 vector digested with the same
two enzymes. The ligated plasmid was transformed into E.coli MM294-1 cells (by
incubation in ice for 65 minutes at 37°C) and bacterial cells plated on LB agar containing
20 µg/ml of chloramphenicol.
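As a quick sanity check on primer design, the recognition sites named for each primer can be located by a plain string search. The sketch below is illustrative: the function name `find_sites` is hypothetical, and the site table lists only the standard recognition sequences of the enzymes mentioned in this document. Because these sites are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences of the enzymes used in this document.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
}


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for every site found."""
    sequence = primer.upper().replace("-", "")
    found = {}
    for enzyme, site in SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            found[enzyme] = positions
    return found


if __name__ == "__main__":
    primers = {
        "953L for/2": "CCGGAATTCTTATGAAAAAAATCATCTTCGCCGC",
        "953L rev/2": "GCCCAAGCTTTTATTGTTTGGCTGCCTCGATT",
    }
    for name, seq in primers.items():
        print(name, find_sites(seq))
```

The same check can be applied to any of the other cloning primers by adding them to the `primers` dictionary.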

Recombinant colonies were grown overnight at 37°C in 4 ml of LB broth containing 20
µg/ml of chloramphenicol; bacterial cells were centrifuged and plasmid DNA was extracted
and analysed by restriction with EcoRI and HindIII. To analyse the ability of the
recombinant colonies to express the protein, they were inoculated in LB broth containing
20 µg/ml of chloramphenicol and left to grow for 16 hours at 37°C. Bacterial cells were
centrifuged and resuspended in PBS. Expression of the protein was analysed by SDS-PAGE
and Coomassie Blue staining.

Expression levels were unexpectedly high from the pSM214 plasmid. 



Oligos used to clone sequences into pSM-214 vectors were as follows:

ΔG287 (pSM-214)           Fwd   CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA   EcoRI
                          Rev   GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII

Δ2 287 (pSM-214)          Fwd   CCGGAATTCTTATG-AGCCAAGATATGGCGGCAGT      EcoRI
                          Rev   GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII

Δ3 287 (pSM-214)          Fwd   CCGGAATTCTTATG-TCCGCCGAATCCGCAAATCA      EcoRI
                          Rev   GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII

Δ4 287 (pSM-214)          Fwd   CCGGAATTCTTATG-GGAAGGGTTGATTTGGCTAATG    EcoRI
                          Rev   GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG        HindIII

Orf46.1 (pSM-214)         Fwd   CCGGAATTCTTATG-TCAGATTTGGCAAACGATTCTT    EcoRI
                          Rev   GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC     HindIII

ΔG287-Orf46.1 (pSM-214)   Fwd   CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA   EcoRI
                          Rev   GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC     HindIII

919 (pSM-214)             Fwd   CCGGAATTCTTATG-CAAAGCAAGAGCATCCAAACCT    EcoRI
                          Rev   GCCCAAGCTT-TTACGGGCGGTATTCGGGCT          HindIII

961L (pSM-214)            Fwd   CCGGAATTCATATG-AAACACTTTCCATCC           EcoRI
                          Rev   GCCCAAGCTT-TTACCACTCGTAATTGAC            HindIII

961 (pSM-214)             Fwd   CCGGAATTCATATG-GCCACAAGCGACGAC           EcoRI
                          Rev   GCCCAAGCTT-TTACCACTCGTAATTGAC            HindIII

961c L (pSM-214)          Fwd   CCGGAATTCTTATG-AAACACTTTCCATCC           EcoRI
                          Rev   GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG         HindIII

961c (pSM-214)            Fwd   CCGGAATTCTTATG-GCCACAAACGACGACG          EcoRI
                          Rev   GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG         HindIII

953 (pSM-214)             Fwd   CCGGAATTCTTATG-GCCACCTACAAAGTGGACGA      EcoRI
                          Rev   GCCCAAGCTT-TTATTGTTTGGCTGCCTCGATT        HindIII

WO 01/64922 PCT/IB01/00452 


These sequences were manipulated, cloned and expressed as described for 953L. 

For the pET-24 vector, sequences were cloned and the proteins expressed in pET-24 as
described below for pET-21. pET-24 has the same sequence as pET-21, but with the kanamycin
resistance cassette instead of the ampicillin cassette.

Oligonucleotides used to clone sequences into pET-24b vector were:

ΔG287 K           Fwd   CijCGuATCCGCTAC^ •                           NheI
                  Rev   CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC *           XhoI

Δ2 287 K          Fwd   CGCGGATCCGCTAGC-CAAGATATGGCGGCAGT §          NheI

Δ3 287 K          Fwd   CGCGGATCCGCTAGC-GCCGAATCCGCAAATCA §          NheI

Δ4 287 K          Fwd   CGCGCTAGC-GGAAGGGTTGATTTGGCTAATGG §          NheI

Orf46.1 K         Fwd   GGGAATTCCATATG-GGCATTTCCCGCAAAATATC          NdeI
                  Rev   CCCGCTCGAG-TTACGTATCATATTTCACGTGC            XhoI

Orf46A K          Fwd   GGGAATTCCATATG-GGCATTTCCCGCAAAATATC          NdeI
                  Rev   CCCGCTCGAG-TTATTCTATGCCTTGTnrnnCAT           XhoI

961 K (MC58)      Fwd   CGCGGATCCCATATG-GCCACAAGCGACGACGA            NdeI
                  Rev   CCCGCTCGAG-TTACCACTCGTAATTGAC                XhoI

961a K            Fwd   CGCGGATCCCATATG-GCCACAAACGACG                NdeI
                  Rev   CCCGCTCGAG-TCATTTAGCAATATTATCTTTOTTr'        XhoI

961b K            Fwd   CGCGGATCCCATATG-AAAGCAAACAGTGCCGAC           NdeI
                  Rev   CCCGCTCGAG-TTACCACTCGTAATTGAC                XhoI

961c K            Fwd   CGCGGATCCCATATG-GCCACAAACGACG                NdeI
                  Rev   CCCGCTCGAG-TTAACCCACGTTGTAAGGT               XhoI

961cL K           Fwd   CGCGGATCCCATATG-ATGAAACACTTTCCATCC           NdeI
                  Rev   CCCGCTCGAG-TTAACCCACGTTGTAAGGT               XhoI

961d K            Fwd   CGCGGATCCCATATG-GCCACAAACGACG                NdeI
                  Rev   CCCGCTCGAG-TCAGTCTGACACTGTTTTATCC            XhoI

ΔG287-919 K       Fwd   CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC            NheI
                  Rev   CCCGCTCGAG-TTACGGGCGGTATTCGG                 XhoI

ΔG287-Orf46.1 K   Fwd   CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC            NheI
                  Rev   CCCGCTCGAG-TTACGTATCATATTTCACGTGC            XhoI

ΔG287-961 K       Fwd   CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC            NheI
                  Rev   CCCGCTCGAG-TTACCACTCGTAATTGAC                XhoI

* This primer was used as a Reverse primer for all the 287 forms.
§ Forward primers used in combination with the ΔG287 K reverse primer.

Example 6 - ORF1 and its leader peptide 

ORF1 from N. meningitidis (serogroup B, strain MC58) is predicted to be an outer membrane
or secreted protein. It has the following sequence:

   1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN
  51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSWSRNG
 101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT
 151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRIGAG
 201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI
 251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF
 301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS
 351 LPNRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE
 401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK
 451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA
 501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT
 551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD
 601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA
 651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAWSRNVAK
 701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS
 751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL
 801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS
 851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL
 901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT
 951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN
1001 NTGNEPASLE QLTWEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG
1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES
1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR
1151 ARRARKDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD
1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV
1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG
1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY
1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR
1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG
1451 IKLGYRW*

The leader peptide is underlined.


A polymorphic form of ORF1 is disclosed in WO99/55873.

Three expression strategies have been used for ORF1:

1) ORF1 using a His tag, following WO99/24578 ('ORF1-His');

2) ORF1 with its own leader peptide but without any fusion partner ('ORF1L'); and

3) ORF1 with the leader peptide (mkktaiaiavalagfatvaqa) from E.coli OmpA
('Orf1LOmpA'):

MKKTAIAIAVAIAGFAWAQAA SAGHTYFGINYOYYR^ 

VSRNGVAALVGDQYIVSTVAHNGGYNNVDTC 

PVEMTSYMIX5RKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRES 

IKHSPYGFLPTGGSFGDSGSPMF IYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRKDWFYDE IFAGDTHSVFYEPRQ 
NGKYSFNDDNNGTGKINAKHEHNSLPNRIJCTRWQLFOT 

ELILTSNINQGAGGLYFQGDFTVSPENNETWQGAGVHISEDST^ I s 

VGDGTVILDQQADDKGKKQAFSEIGLVSGRGWQLNADNQFNPDKLYFGFRGGRLDLNGHS 

NHNQDKESTVTITGNKDIATTGNNNSLDSKKEIAYNGWFGEKDTTKTO 

QTNGKLFFSGRPTPHAYNHLNDHWSQKEG I PRGEIVWDNDWINRTFKAENFQI KGGQAWSRNVAKVKGDWHL SNHA 
QAVFGVAPHQSHTICTRSDWTGLTNCVEKTI^^ 

WSHNATQNGNL SLVGNAQATFNQATLNGNTSASGl^SFNL SDHAVQNGSLTL SGNAKANVSHSALNGNVSLADKAV 

FHFESSRFTGQISGGKOTALHLKDSEWTLPSGTELGNLNLDN^^ 

LLSVTPPTSVESRFNTLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLA 

NKPL SENLNFTLQNEHVDAGAWRYQIi I RKIX3EFRLHNPVKEQELSDKLGKAEAKKQAEKDNAQSLDAL I AAGRDAVE 

KTESVAEPARQAGGENVGIMQAEEEKKRVQADKDTALAKQREAETRPATTAFPRARRARra 

ISRYANSGLSEFSATLNSWAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYR 

GILF SHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGI S AGAGF S SGSLSDG IGGK IRRRVLHYG I QARYRAGF 
GGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRY 

TAVLAQDFGKTRSAEWGVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRW*

To make this construct, the clone pET911LOmpA (see below) was digested with the
NheI and XhoI restriction enzymes and the fragment corresponding to the vector
carrying the OmpA leader sequence was purified (pETLOmpA). The ORF1 gene
coding for the mature protein was amplified using the oligonucleotides ORF1-For
and ORF1-Rev (including the NheI and XhoI restriction sites, respectively), digested
with NheI and XhoI and ligated to the purified pETLOmpA fragment (see Figure 1).
An additional AS dipeptide was introduced by the NheI site.

All three forms of the protein were expressed. The His-tagged protein could be purified and
was confirmed as surface exposed, and possibly secreted (see Figure 3). The protein was
used to immunise mice, and the resulting sera gave excellent results in the bactericidal assay.

ORF1LOmpA was purified as total membranes, and was localised in both the inner and
outer membranes. Unexpectedly, sera raised against ORF1LOmpA show even better ELISA
and anti-bactericidal properties than those raised against the His-tagged protein.

ORF1L was purified as outer membranes, where it is localised.

Example 7 - protein 911 and its leader peptide

Protein 911 from N. meningitidis (serogroup B, strain MC58) has the following sequence: 

  1 MKKNILEFWV GLFVLIGAAA VAFLA FRVAG GAAFGGSDKT YAVYADFGDI
 51 GGLKVNAPVK SAGVLVGRVG A1GLDPKSYQ ARVRLDLDGK YQFSSDVSAQ
101 ILTSGLLGEQ YIGLQQGGDT ENLAAGDTIS VTSSAMVLEN LIGKFMTSFA
151 EKNADGGNAE KAAE*

The leader peptide is underlined. 

Three expression strategies have been used for 911:

1) 911 with its own leader peptide but without any fusion partner ('911L');

2) 911 with the leader peptide from E.coli OmpA ('911LOmpA').

To make this construct, the entire sequence encoding the OmpA leader peptide was
included in the 5'-primer as a tail (primer 911LOmpA Forward). An NheI restriction
site was inserted between the sequence coding for the OmpA leader peptide and the
911 gene encoding the predicted mature protein (insertion of one amino acid, a
serine), to allow the use of this construct to clone different genes downstream of the
OmpA leader peptide sequence.

3) 911 with the leader peptide (mkyllptaaaglllaaqpama) from Erwinia carotovora
PelB ('911LpelB').

To make this construct, the 5'-end PCR primer was designed downstream from the
leader sequence and included the NcoI restriction site in order to have the 911 fused
directly to the PelB leader sequence; the 3'-end primer included the STOP codon.
The expression vector used was pET22b+ (Novagen), which carries the coding
sequence for the PelB leader peptide. The NcoI site introduces an additional
methionine after the PelB sequence.

All three forms of the protein were expressed. ELISA titres were highest using 911L, with
911LOmpA also giving good results.

Example 8 - ORF46

The complete ORF46 protein from N. meningitidis (serogroup B, strain 2996) has the
following sequence:

  1 LGISRKISLI LSIIAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
 51 FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY IVRFSDHGHE
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGSMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK RSQMGAIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP
451 VSDAKPRWEV DRKLNKLTTR EQVEKNVQEI RNGNKNSNFS QHAQLEREIN
501 KLKSADEINF ADGMGKFTDS MNDKAFSRLV KSVKENGFTN PWEYVEING
551 KAYIVRGNNR VFAAEYLGRI HELKFKKVDF PVPNTSWKNP TDVLNESGNV
601 KRPRYRSK*


The leader peptide is underlined. 

The sequences of ORF46 from other strains can be found in WO00/66741.

Three expression strategies have been used for ORF46:

1) ORF46 with its own leader peptide but without any fusion partner ('ORF46-2L');

2) ORF46 without its leader peptide and without any fusion partner ('ORF46-2'), with
the leader peptide omitted by designing the 5'-end amplification primer downstream
from the predicted leader sequence:

  1 SDLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERSGHI GLGKIQSHQL
 51 GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS DEAGSPVDGF
101 SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI KGVAQNIRLN
151 LTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL DRSGNAAEAF
201 NGTADIVKNI IGAAGEIVGA GDAVQGISEG SNIAVMHGLG LLSTENKMAR
251 INDLADMAQL KDYAAAAIRD WAVQNPNAAQ GIEAVSNIFM AAIPIKGIGA
301 VRGKYGLGGI TAHPIKRSQM GAIALPKGKS AVSDNFADAA YAKYPSPYHS
351 RNIRSNLEQR YGKENITSST VPPSNGKNVK LADQRHPKTG VPFDGKGFPN
401 FEKHVKYDTK LDIQELSGGG IPKAKPVSDA KPRWEVDRKL NKLTTREQVS
451 KNVQEIRNGN KNSNFSQHAQ LEREINKLKS ADEINFADGM GKFTDSMNDK
501 AFSRLVKSVK ENGFTNPWE YVEINGKAYI VRGNNRVFAA EYLGRIHELK
551 FKKVDFPVPN TSWKNPTDVL NESGNVKRPR YRSK*

3) ORF46 as a truncated protein, consisting of the first 433 amino acids ('ORF46.1L'),
constructed by designing PCR primers to amplify a partial sequence corresponding
to aa 1-433.

A STOP codon was included in the 3'-end primer sequences.

ORF46-2L is expressed at a very low level in E.coli. Removal of its leader peptide
(ORF46-2) does not solve this problem. The truncated ORF46.1L form (first 433 amino
acids, which are well conserved between serogroups and species), however, is
well-expressed and gives excellent results in ELISA test and in the bactericidal assay.

ORF46.1 has also been used as the basis of hybrid proteins. It has been fused with 287, 919,
and ORF1. The hybrid proteins were generally insoluble, but gave some good ELISA and
bactericidal results (against the homologous 2996 strain):



Protein               ELISA    Bactericidal Ab

Orf1-Orf46.1-His      850      256
919-Orf46.1-His       12900    512
919-287-Orf46-His     n.d.     n.d.
Orf46.1-287His        150      8192
Orf46.1-919His        2800     2048
Orf46.1-287-919His    3200     16384


For comparison, 'triple' hybrids of ORF46.1, 287 (either as a GST fusion, or in ΔG287
form) and 919 were constructed and tested against various strains (including the homologous
2996 strain) versus a simple mixture of the three antigens. FCA was used as adjuvant:





                        2996     BZ232    MC58     NGH38    F6124    BZ133

Mixture                 8192     256      512      1024     >2048    >2048
ORF46.1-287-919his      16384    256      4096     8192     8192     8192
ΔG287-919-ORF46.1his    8192     64       4096     8192     8192     16384
ΔG287-ORF46.1-919his    4096     128      256      8192     512      1024


Again, the hybrids show equivalent or superior immunological activity. 
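Because bactericidal titres are reported as two-fold dilution steps, hybrids and mixtures are conveniently compared on a log2 scale. The sketch below is illustrative only: it uses the titres tabulated above for the mixture and for the ORF46.1-287-919his hybrid, treats values such as '>2048' as censored (no fold change is computed for them), and the function names are hypothetical.

```python
import math

# Titres copied from the table above, in the strain order listed there.
STRAINS = ["2996", "BZ232", "MC58", "NGH38", "F6124", "BZ133"]
TITRES = {
    "Mixture": ["8192", "256", "512", "1024", ">2048", ">2048"],
    "ORF46.1-287-919his": ["16384", "256", "4096", "8192", "8192", "8192"],
}


def parse_titre(text):
    """Return (value, censored); '>2048' or '<4' are flagged as censored."""
    text = text.strip()
    if text[0] in "<>":
        return float(text[1:]), True
    return float(text), False


def log2_fold_vs_mixture(hybrid):
    """log2(hybrid titre / mixture titre) per strain; None if either is censored."""
    result = {}
    for strain, hyb, mix in zip(STRAINS, TITRES[hybrid], TITRES["Mixture"]):
        hyb_value, hyb_censored = parse_titre(hyb)
        mix_value, mix_censored = parse_titre(mix)
        if hyb_censored or mix_censored:
            result[strain] = None
        else:
            result[strain] = math.log2(hyb_value / mix_value)
    return result


if __name__ == "__main__":
    for strain, change in log2_fold_vs_mixture("ORF46.1-287-919his").items():
        label = "n/a (censored)" if change is None else f"{change:+.0f} dilution step(s)"
        print(f"{strain}: {label}")
```

A value of zero means the hybrid and the mixture gave the same titre; positive values are dilution steps in favour of the hybrid.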



Hybrids of two proteins (strain 2996) were compared to the individual proteins against
various heterologous strains:

                    1000    MC58    F6124 (MenA)

ORF46.1-His         <4      4096    <4
ORF1-His            8       256     128
ORF1-ORF46.1-His    1024    512     1024

Again, the hybrid shows equivalent or superior immunological activity. 



Example 9 - protein 961 

The complete 961 protein from N.meningitidis (serogroup B, strain MC58) has the following
sequence:

  1 MSMKHFPAKV LTTAILATFC SGALAA TSDD DVKKAATVAI VAAYNNGQEI
 51 NGFKAGETIY DIGEDGTITQ KDATAADVEA DDFKGLGLKK WTNLTKTYN
101 ENKQNVDAKV KAAESEIEKL TTKLADTDAA LADTDAALDE TTNALNKLGE
151 NITTFAEETK TNIVKIDEKL EAVADTVDKH AEAFNDIADS LDETNTKADE
201 AVKTANEAKQ TAEETKQNVD AKVKAAETAA GKAEAAAGTA NTAADKAEAV
251 AAKVTDIKAD IATNKADIAK NSARIDSUDK NVANLRKETR QGLAEQAALS
301 GLFQPYNVGR FNVTAAVGGY KSESAVAIGT GFRFTENFAA KAGVAVGTSS
351 GSSAAYHVGV NYEW*

The leader peptide is underlined. 

Three approaches to 961 expression were used:

1) 961 using a GST fusion, following WO99/57280 ('GST961');

2) 961 with its own leader peptide but without any fusion partner ('961L'); and

3) 961 without its leader peptide and without any fusion partner ('961 untagged'), with the
leader peptide omitted by designing the 5'-end PCR primer downstream from the
predicted leader sequence.

All three forms of the protein were expressed. The GST-fusion protein could be purified and 
antibodies against it confirmed that 961 is surface exposed (Figure 4). The protein was used 
to immunise mice, and the resulting sera gave excellent results in the bactericidal assay. 
961L could also be purified and gave very high ELISA titres. 

Protein 961 appears to be phase variable. Furthermore, it is not found in all strains of
N.meningitidis.

Example 10 - protein 287 

Protein 287 from N.meningitidis (serogroup B, strain 2996) has the following sequence: 

  1 MFERSVIAMA CIFALSA CGG GGGGSPDVKS ADTLSKPAAP WAEKETEVK
 51 EDAPQAGSQG QGAPSTQGSQ DMAAVSAENT GNGGAATTDK PKNEDEGPQN
101 DMPQNSAESA NQTGNNQPAD SSDSAPASNP APANGGSNFG RVDLANGVLI
151 DGPSQNITLT HCKGDSCNGD NLLDEEAPSK SEFENLNESE RIEKYKKDGK
201 SDKFTNLVAT AVQANGTNKY VIIYKDKSAS SSSARFRRSA RSRRSLPAEM
251 PLIPVNQADT LIVDGEAVSL TGHSGNIFAP EGNYRYLTYG AEKLPGGSYA
301 LRVQGEPAKG EMLAGTAVYN GEVLHFHTEN GRPYPTRGRF AAKVDFGSKS
351 VDGIIDSGDD LHMGTQKFKA AIDGNGFKGT WTENGGGDVS GRFYGPAGEE
401 VAGKYSYRPT DAEKGGFGVF AGKKEQD*

The leader peptide is shown underlined.

The sequences of 287 from other strains can be found in Figures 5 and 15 of WO00/66741.
Example 9 of WO99/57280 discloses the expression of 287 as a GST-fusion in E.coli.

A number of further approaches to expressing 287 in E.coli have been used, including:

1) 287 as a His-tagged fusion ('287-His');

2) 287 with its own leader peptide but without any fusion partner ('287L');

3) 287 with the ORF4 leader peptide and without any fusion partner ('287LOrf4'); and

4) 287 without its leader peptide and without any fusion partner ('287 untagged'):

  1 CGGGGGGSPD VKSADTLSKP AAPWAEKET EVKEDAPQAG SQGQGAPSTQ
 51 GSQDMAAVSA ENTGNGGAAT TDKPKNEDEG PQNDMPQNSA ESANQTGNNQ
101 PADSSDSAPA SNPAPANGGS NFGRVDLANG VLIDGPSQNI TLTHCKGDSC
151 NGDNLLDEEA PSKSEFENLN ESERIEKYKK DGKSDKFTNL VATAVQANGT
201 NKYVIIYKDK SASSSSARFR RSARSRRSLP AEMPLIPVNQ ADTLIVDGEA
251 VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP AKGEMLAGTA
301 VYNGEVLHFH TENGRPYPTR GRFAAKVDFG SKSVDGIIDS GDDLHMGTQK
351 FKAAIDGNGF KGTWTENGGG DVSGRFYGPA GEEVAGKYSY RPTDAEKGGF
401 GVFAGKKEQD *

All these proteins could be expressed and purified.

'287L' and '287LOrf4' were confirmed as lipoproteins. 

As shown in Figure 2, '287LOrf4' was constructed by digesting 919LOrf4 with NheI and
XhoI. The entire ORF4 leader peptide was restored by the addition of a DNA sequence
coding for the missing amino acids, as a tail, in the 5'-end primer (287LOrf4 For), fused to
the 287 coding sequence. The 287 gene coding for the mature protein was amplified using the
oligonucleotides 287LOrf4 For and Rev (including the NheI and XhoI sites, respectively),
digested with NheI and XhoI and ligated to the purified pETOrf4 fragment.

Example 11 - further non-fusion proteins with/without native leader peptides 

A similar approach was adopted for E.coli expression of further proteins from WO99/24578,
WO99/36544 and WO99/57280.

The following were expressed without a fusion partner: 008, 105, 117-1, 121-1, 122-1, 128-1,
148, 216, 243, 308, 593, 652, 726, 982, and Orf143-1. Protein 117-1 was confirmed as
surface-exposed by FACS and gave high ELISA titres.

The following were expressed with the native leader peptide but without a fusion partner:
111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503, 519-1, 525-1,
552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 926, 936-1, 953, 961, 983,
989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2, Orf72-1,
Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1. These proteins are given the suffix 'L'.

His-tagged protein 760 was expressed with and without its leader peptide. The deletion of
the signal peptide greatly increased expression levels. The protein could be purified most
easily using 2M urea for solubilisation.

His-tagged protein 264 was well-expressed using its own signal peptide, and the 30kDa 
protein gave positive Western blot results. 

All proteins were successfully expressed. 

The localisation of 593, 121-1, 128-1, 593, 726, and 982 in the cytoplasm was confirmed.

The localisation of 920-1L, 953L, ORF9-1L, ORF85-2L, ORF97-1L, 570L, 580L and 664L 
in the periplasm was confirmed. 

The localisation of ORF40L in the outer membrane, and 008 and 519-1L in the inner
membrane was confirmed. ORF25L, ORF4L, 406L, 576-1L were all confirmed as being
localised in the membrane.

Protein 206 was found not to be a lipoprotein. 

ORF25 and ORF40 expressed with their native leader peptides but without fusion partners,
and protein 593 expressed without its native leader peptide and without a fusion partner,
raised good bactericidal sera. Surprisingly, the forms of ORF25 and ORF40 expressed
without fusion partners and using their own leader peptides (i.e. 'ORF25L' and 'ORF40L')
give better results in the bactericidal assay than the fusion proteins.

Proteins 920L and 953L were subjected to N-terminal sequencing, giving hrvwvetah and
atykvdeyhanarfaf, respectively. This sequencing confirms that the predicted leader
peptides were cleaved and, when combined with the periplasmic location, confirms that the
proteins are correctly processed and localised by E.coli when expressed from their native
leader peptides.

The N-terminal sequence of protein 519.1L localised in the inner membrane was meffiilla,
indicating that the leader sequence is not cleaved. It may therefore function as both an
uncleaved leader sequence and a transmembrane anchor in a manner similar to the leader
peptide of PBP1 from N. gonorrhoeae [Ropp & Nicholas (1997) J. Bact. 179:2783-2787.].
Indeed the N-terminal region exhibits strong hydrophobic character and is predicted by the
TMpred program to be transmembrane.

Example 12 - lipoproteins 

The incorporation of palmitate in recombinant lipoproteins was demonstrated by the method
of Kraft et al. [J. Bact. (1998) 180:3441-3447.]. Single colonies harbouring the plasmid of
interest were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. The
culture was diluted to an OD550 of 0.1 in 5.0 ml of fresh LB/Amp medium
containing 5 µCi/ml [3H] palmitate (Amersham). When the OD550 of the culture reached
0.4-0.8, recombinant lipoprotein was induced for 1 hour with IPTG (final concentration 1.0
mM). Bacteria were harvested by centrifugation in a bench top centrifuge at 2700g for 15
min and washed twice with 1.0 ml cold PBS. Cells were resuspended in 120 µl of 20 mM
Tris-HCl (pH 8.0), 1 mM EDTA, 1.0% w/v SDS and lysed by boiling for 10 min. After
centrifugation at 13000g for 10 min the supernatant was collected and proteins precipitated
by the addition of 1.2 ml cold acetone and left for 1 hour at -20 °C. Protein was pelleted by
centrifugation at 13000g for 10 min and resuspended in 20-50 µl (calculated to standardise
loading with respect to the final O.D. of the culture) of 1.0% w/v SDS. An aliquot of 15 µl
was boiled with 5 µl of SDS-PAGE sample buffer and analysed by SDS-PAGE. After
electrophoresis gels were fixed for 1 hour in 10% v/v acetic acid and soaked for 30 minutes
in Amplify solution (Amersham). The gel was vacuum-dried under heat and exposed to
Hyperfilm (Kodak) overnight at -80 °C.
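
The dilution of the overnight culture to an OD550 of 0.1 in 5.0 ml follows the usual C1V1 = C2V2 relation. The following is only a minimal sketch of that arithmetic: the function name is hypothetical, and the overnight OD550 used in the example (2.5) is an assumed value, since the density of the overnight culture is not stated here.

```python
def inoculum_volume_ml(overnight_od, target_od=0.1, final_volume_ml=5.0):
    """Volume of overnight culture (ml) that gives `target_od` in `final_volume_ml`.

    Uses C1 * V1 = C2 * V2, treating OD550 as proportional to cell density
    (an approximation that holds only in the linear range of the reading).
    """
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target OD")
    return target_od * final_volume_ml / overnight_od


# Assumed overnight OD550 of 2.5 (illustrative value, not from the protocol).
volume = inoculum_volume_ml(2.5)
print(f"Add {volume:.2f} ml of overnight culture and make up to 5.0 ml")
```

The remaining volume is made up with fresh LB/Amp medium containing the labelled palmitate.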

Incorporation of the [3H] palmitate label, confirming lipidation, was found for the following
proteins: Orf4L, Orf25L, 287L, 287LOrf4, 406L, 576L, 926L, 919L and 919LOrf4.

Example 13 - domains in 287 

Based on homology of different regions of 287 to proteins that belong to different functional
classes, it was split into three 'domains', as shown in Figure 5. The second domain shows
homology to IgA proteases, and the third domain shows homology to transferrin-binding
proteins.

Each of the three 'domains' shows a different degree of sequence conservation between
N.meningitidis strains - domain C is 98% identical, domain A is 83% identical, whilst
domain B is only 71% identical. Note that protein 287 in strain MC58 is 61 amino acids
longer than that of strain 2996. An alignment of the two sequences is shown in Figure 7, and
alignments for various strains are disclosed in WO00/66741 (see Figures 5 and 15 therein).

The three domains were expressed individually as C-terminal His-tagged proteins. This was 
done for the MC58 and 2996 strains, using the following constructs: 

287a-MC58 (aa 1-202), 287b-MC58 (aa 203-288), 287c-MC58 (aa 311-488).

287a-2996 (aa 1-139), 287b-2996 (aa 140-225), 287c-2996 (aa 250-427). 
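
The domain boundaries listed above are 1-based, inclusive amino acid positions. As an illustrative sketch only (the dictionary layout and function names are hypothetical), the coordinates can be turned into domain lengths and into slices of a protein string:

```python
# Domain coordinates (1-based, inclusive) as listed for the six constructs.
DOMAINS = {
    "MC58": {"a": (1, 202), "b": (203, 288), "c": (311, 488)},
    "2996": {"a": (1, 139), "b": (140, 225), "c": (250, 427)},
}


def domain_length(span):
    """Number of residues covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def slice_domain(protein, span):
    """Extract a domain from a protein string using 1-based inclusive coordinates."""
    start, end = span
    return protein[start - 1:end]


for strain, spans in DOMAINS.items():
    lengths = {name: domain_length(span) for name, span in spans.items()}
    print(strain, lengths)
```

On these coordinates the b and c constructs have the same length in both strains, so the length difference between the two constructs sets lies in domain a.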

To make these constructs, the stop codon sequence was omitted in the 3'-end primer
sequence. The 5' primers included the NheI restriction site, and the 3' primers included an
XhoI site as a tail, in order to direct the cloning of each amplified fragment into the expression
vector pET21b+ using NdeI-XhoI, NheI-XhoI or NdeI-HindIII restriction sites.
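
The primer design described above (restriction site as a 5' tail, stop codon left out of the 3'-end primer) can be sketched as follows. This is an assumption-laden illustration, not the primers actually used: the 4-nt clamp, the 18-nt annealing length and the function names are hypothetical, and TOY_ORF is a shortened stand-in built from the first and last 30 nucleotides of the 953 portion of the hybrid sequence given later in this document. The recognition sequences GCTAGC (NheI) and CTCGAG (XhoI) are the standard ones.

```python
SITES = {"NheI": "GCTAGC", "XhoI": "CTCGAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_primers(orf, anneal=18, clamp="GCGC"):
    """Return (forward, reverse) primers with NheI / XhoI tails.

    The stop codon, if present, is dropped before the reverse primer is
    built, so that a C-terminal tag encoded by the vector can be read through.
    `clamp` and `anneal` are illustrative defaults, not values from the text.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    forward = clamp + SITES["NheI"] + orf[:anneal]
    reverse = clamp + SITES["XhoI"] + reverse_complement(orf[-anneal:])
    return forward, reverse


# Shortened stand-in ORF (not a full-length gene), ending in a TAA stop codon.
TOY_ORF = "GCCACCTACAAAGTGGACGAATATCACGCC" + "GACATCCAAATCGAGGCAGCCAAACAATAA"
fwd, rev = design_primers(TOY_ORF)
print("forward:", fwd)
print("reverse:", rev)
```

Whether the reading frame is maintained across the cloning site depends on the vector and is not checked by this sketch.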

All six constructs could be expressed, but 287b-MC58 required denaturation and refolding for
solubilisation.

Deletion of domain A is described below ('Δ4 287-His').

Immunological data (serum bactericidal assay) were also obtained using the various domains
from strain 2996, against the homologous and heterologous MenB strains, as well as MenA
(F6124 strain) and MenC (BZ133 strain):





              2996    BZ232   MC58    NGH38   394/98  MenA    MenC
287-His       32000   16      4096    4096    512     8000    16000
287(B)-His    256     -       -       -       -       16      -
287(C)-His    256     -       32      512     32      2048    >2048
287(B-C)-His  64000   128     4096    64000   1024    64000   32000



Using the domains of strain MC58, the following results were obtained:

              MC58    2996    BZ232   NGH38   394/98  MenA    MenC
287-His       4096    32000   16      4096    512     8000    16000
287(B)-His    128     128     -       -       -       -       128
287(C)-His    -       16      -       1024    -       512     -
287(B-C)-His  16000   64000   128     64000   512     64000   >8000



Example 14 - deletions in 287 

As well as expressing individual domains, 287 was also expressed (as a C-terminal
His-tagged protein) by making progressive deletions within the first domain.

Four deletion mutants of protein 287 from strain 2996 were used (Figure 6):

1) '287-His', consisting of amino acids 18-427 (i.e. leader peptide deleted);

2) 'Δ1 287-His', consisting of amino acids 26-427;

3) 'Δ2 287-His', consisting of amino acids 70-427;

4) 'Δ3 287-His', consisting of amino acids 107-427; and

5) 'Δ4 287-His', consisting of amino acids 140-427 (=287-bc).

The 'Δ4' protein was also made for strain MC58 ('Δ4 287MC58-His'; aa 203-488).

The constructs were made in the same way as 287a/b/c, as described above. 

All six constructs could be expressed and protein could be purified. Expression of 287-His 
was, however, quite poor. 

Expression was also high when the C-terminal His-tags were omitted. 

Immunological data (serum bactericidal assay) were also obtained using the deletion
mutants, against the homologous (2996) and heterologous MenB strains, as well as MenA
(F6124 strain) and MenC (BZ133 strain):





              2996    BZ232   MC58    NGH38   394/98  MenA    MenC
287-His       32000   16      4096    4096    512     8000    16000
Δ1 287-His    16000   128     4096    4096    1024    8000    16000
Δ2 287-His    16000   128     4096    >2048   512     16000   >8000
Δ3 287-His    16000   128     4096    >2048   512     16000   >8000
Δ4 287-His    64000   128     4096    64000   1024    64000   32000



The same high activity for the Δ4 deletion was seen using the sequence from strain MC58.

As well as showing superior expression characteristics, therefore, the mutants are
immunologically equivalent or superior.
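
One way to make the comparison with 287-His explicit is to express each deletion mutant's bactericidal titre as a fold-change over 287-His for the same test strain. The sketch below uses the titres tabulated above for the strain 2996 deletion mutants; the function names are hypothetical, and titres reported as '>N' are treated as the lower bound N, which is an assumption of this sketch.

```python
STRAINS = ["2996", "BZ232", "MC58", "NGH38", "394/98", "MenA", "MenC"]

# Serum bactericidal titres, in the column order of STRAINS.
TITRES = {
    "287-His":    ["32000", "16", "4096", "4096", "512", "8000", "16000"],
    "Δ1 287-His": ["16000", "128", "4096", "4096", "1024", "8000", "16000"],
    "Δ2 287-His": ["16000", "128", "4096", ">2048", "512", "16000", ">8000"],
    "Δ3 287-His": ["16000", "128", "4096", ">2048", "512", "16000", ">8000"],
    "Δ4 287-His": ["64000", "128", "4096", "64000", "1024", "64000", "32000"],
}


def parse_titre(text):
    """Return (value, is_lower_bound); '>2048' becomes (2048.0, True)."""
    text = text.strip()
    return float(text.lstrip(">")), text.startswith(">")


def fold_vs_baseline(construct, baseline="287-His"):
    """Fold-change of each titre relative to the baseline construct."""
    folds = []
    for mutant, reference in zip(TITRES[construct], TITRES[baseline]):
        mutant_value, _ = parse_titre(mutant)
        reference_value, _ = parse_titre(reference)
        folds.append(mutant_value / reference_value)
    return folds


for strain, fold in zip(STRAINS, fold_vs_baseline("Δ4 287-His")):
    print(f"{strain:>7}: {fold:g}-fold")
```

For the Δ4 mutant every fold-change is 1 or greater, in line with the statement that it is at least equivalent to 287-His against each strain tested.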



Example 15 - poly-glycine deletions

The 'Δ1 287-His' construct of the previous example differs from 287-His and from
'287 untagged' only by a short N-terminal deletion (GGGGGGS). Using an expression vector
which replaces the deleted serine with a codon present in the NheI cloning site, however, this
amounts to a deletion only of (Gly)6. Thus, the deletion of this (Gly)6 sequence has been
shown to have a dramatic effect on protein expression.

The protein lacking the N-terminal amino acids up to GGGGGG is called 'ΔG287'. In strain
MC58, its sequence (leader peptide underlined) is:

ΔG287

1 MFKRSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP VVSEKETEAK

51 EDAPQAGSQG QGAPSAQGSQ DMAAVSEENT GNGGAVTADN PKNEDEVAQN 

101 DMPQNAAGTD SSTPNHTPDP NMLAGNMENQ ATDAGESSQP ANQPDMANAA 

151 DGMQGDDPSA GGQNAGNTAA QGANQAGNNQ AAGSSDPIPA SNPAPANGGS 

201 NFGRVDLANG VLIDGPSQNI TLTHCKGDSC SGNNFLDEEV QLKSEFEKLS 

251 DADKISNYKK DGKNDKFVGL VADSVQMKGI NQYIIFYKPK PTSFARFRRS 

301 ARSRRSLPAE MPLIPVNQAD TLIVDGEAVS LTGHSGNIFA PEGNYRYLTY 

351 GAEKLPGGSY ALRVQGEPAK GEMLAGAAVY NGEVLHFHTE NGRPYPTRGR 

401 FAAKVDFGSK SVDGIIDSGD DLHMGTQKFK AAIDGNGFKG TWTENGSGDV 

451 SGKFYGPAGE EVAGKYSYRP TDAEKGGFGV FAGKKEQD* 

ΔG287, with or without His-tag ('ΔG287-His' and 'ΔG287K', respectively), is expressed at
very good levels in comparison with '287-His' or '287 untagged'.

On the basis of gene variability data, variants of ΔG287-His were expressed in E.coli from a
number of MenB strains, in particular from strains 2996, MC58, 1000, and BZ232. The
results were also good.

It was hypothesised that poly-Gly deletion might be a general strategy to improve
expression. Other MenB lipoproteins containing similar (Gly)n motifs (near the N-terminus,
downstream of a cysteine) were therefore identified, namely Tbp2 (NMB0460), 741
(NMB1870) and 983 (NMB1969):

Tbp2 / ΔGTbp2

1 MNNPLVNQAA MVLPVFLLSA CLGGGGSFDL DSVDTEAPRP APKYQDVFSE 

51 KPQAQKDQGG YGFAMRLKRR NWYPQAKEDE VKLDESDWEA TGLPDEPKEL 

101 PKRQKSVIEK VETDSDNNIY SSPYLKPSNH QNGNTGNGIN QPKNQAKDYE 

151 NFKYVYSGWF YKHAKREFNL KVEPKSAKNG DDGYIFYHGK EPSRQLPASG 

201 KITYKGVWHF ATDTKKGQKF REIIQPSKSQ GDRYSGFSGD DGEEYSNKNK 

251 STLTDGQEGY GFTSNLEVDF HNKKLTGKLI RNNANTDNNQ ATTTQYYSLE 

301 AQVTGNRFNG KATATDKPQQ NSETKEHPFV SDSSSLSGGF FGPQGEELGF 

351 RFLSDDQKVA WGSAKTKDK PANGNTAAAS GGTDAAASNG AAGTSSENGK 

401 LTTVLDAVEL KLGDKEVQKL DNFSNAAQLV VDGIMIPLLP EASESGNNQA 

451 NQGTNGGTAF TRKFDHTPES DKKDAQAGTQ TNGAQTASNT AGDTNGKTKT 
501 YEVEVCCSNL NYLKYGMLTR KNSRSAMQAG ESSSQADAKT EQVEQSMFLQ

551 GBRTDEKEIP SEQNIVYRGS WYGYIANDKS TSWSGNASNA TSGNRAEFTV 

601 NFADKKITGT LTADNRQEAT FTIDGNIKDN GFEGTAKTAE SGFDLDQSNT 

651 TRTPKAYITD AKVQGGFYGP KAEBLGGWFA YPGDKQTKNA TNASGNSSAT 

701 WPGAKRQQP VR* 

741 / ΔG741

1 VNRTAFCCLS LTTALILTAC SSGGGGVAAD IGAGLADALT APLDHKDKGL

51 QSLTLDQSVR KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ 

101 IEVDGQLITL ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI 

151 GDIAGEHTSF DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI 

201 EHLKSPELNV DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA 

251 QEVAGSAEVK TVNGIRHIGL AAKQ* 



983 / ΔG983

1 MRTTPTFPTK TFKPTAMALA VATTLSACLG GGGGGTSAPD FNAGGTGIGS

51 NSRATTAKSA AVSYAGIKNE MCKDRSMLCA GRDDVAVTDR DAKINAPPPN 

101 LHTGDFPNPN DAYKNLINLK PAXEAGYTGR GVEVGIVDTG ESVGSISFPE 

151 LYGRKEHGYN ENYKNYTAYM RKEAPEDGGG KDIEASFDDE AVIETEAKPT 

201 DIRHVKEIGH IDLVSHIIGG RSVDGRPAGG IAPDATLHIM NTNDETKNEM 

251 MVAAIRNAWV KLGERGVRIV NNSFGTTSRA GTADLFQIAN SEEQYRQALL 

301 DYSGGDKTDE GIRLMQQSDY GNLSYHIRNK NMLFXFSTGN DAQAQPNTYA 

351 LLPFYEKDAQ KGIITVAGVD RSGEKFKREM YGEPGTBPLE YGSNHCGITA 

401 MWCLSAPYEA SVRFTRTNPI QIAGTSFSAP rVTGTAALLL QKYPWMSNDN 

451 LRTTIiLTTAQ DIGAVGVDSK FGWGLLDAGK AMNGPASFPF GDFTADTKGT 

501 SDIAYSFRND ISGTGGLIKK GGSQLQLHGN NTYTGKTIIE GGSLVLYGNN 

551 KSDMRVETKG ALIYNGAASG GSLNSDGIVY LADTDQSGAN ETVHIKGSLQ 

601 LDGKGTLYTR LGKLLKVDGT AIIGGKLYMS ARGKGAGYLN STGRRVPFLS 

651 AAKIGQDYSF FTNIETDGGL LASLDSVEKT AGSEGDTLSY YVRRGNAART 

701 ASAAAHSAPA GLKHAVEQGG SNLENLMVEL DASESSATPE TVETAAADRT 

751 DMPGIRPYGA TFRAAAAVQH ANAADGVRIF NSLAATVYAD STAAHADMQG 

801 RRIiKAVSDGL DHNGTGLRVI AQTQQDGGTW EQGGVEGKMR GSTQTVGIAA 

851 KTGENTTAAA TLGMGRSTWS ENSANAKTDS ISLFAGIRHD AGDIGYLKGL 

901 FSYGRYKNSI SRSTGADEHA EGSVNGTLMQ LGALGGVNVP FAATGDLTVE 

951 GGLRYDLLKQ DAFAEKGSAL GWSGNSLTEG TLVGIAGLKL SQPLiSDKAVL 

1001 FATAGVERDL NGRDYTVTGG FTGATAATGK TGARNMPHTR LVAGLGADVE 

1051 FGNGWNGLAR YSYAGSKQYG NHSGRVGVGY RF* 

Tbp2 and 741 genes were from strain MC58; 983 and 287 genes were from strain 2996.
These were cloned in pET vector and expressed in E.coli without the sequence coding for
their leader peptides or as "ΔG forms", both fused to a C-terminal His-tag. In each case, the
same effect was seen - expression was good in the clones carrying the deletion of the
poly-glycine stretch, and poor or absent if the glycines were present in the expressed protein:

ΔG287-His(2996)     +
ΔG287K(2996)        +
741-His(MC58)       nd      nd
ΔG741-His(MC58)     +       +
983-His (2996)
ΔG983-His (2996)    +







SDS-PAGE of the proteins is shown in Figure 13. 
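
The poly-glycine deletion can be illustrated on the N-terminal regions of the four proteins listed above. This is a sketch under stated assumptions: the function name, the minimum run length of four glycines and the 40-residue search window are hypothetical choices, and the ΔG form is assumed to begin immediately after the glycine run, as described for ΔG287. The N-terminal fragments are the first 30-40 residues of the sequences shown above.

```python
import re

# N-terminal fragments (leader peptide included) of the four lipoproteins.
N_TERMINI = {
    "287 (2996)":  "MFERSVIAMACIFALSACGGGGGGSPDVKS",
    "Tbp2 (MC58)": "MNNPLVNQAAMVLPVFLLSACLGGGGSFDLDSVDTEAPRP",
    "741 (MC58)":  "VNRTAFCCLSLTTALILTACSSGGGGVAADIGAGLADALT",
    "983 (2996)":  "MRTTPTFPTKTFKPTAMALAVATTLSACLGGGGGGTSAPD",
}


def delete_poly_gly(protein, min_run=4, window=40):
    """Remove everything up to and including the first (Gly)n run, n >= min_run.

    Only the first `window` residues are searched, since the motif of
    interest lies near the N-terminus. The sequence is returned unchanged
    when no such run is found.
    """
    match = re.search("G{%d,}" % min_run, protein[:window])
    if match is None:
        return protein
    return protein[match.end():]


for name, fragment in N_TERMINI.items():
    print(f"{name:12} -> {delete_poly_gly(fragment)}")
```

In an expression construct the truncated sequence would still need an initiator methionine and any residues contributed by the cloning site, which this sketch does not add.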



ΔG287 and hybrids

ΔG287 proteins were made and purified for strains MC58, 1000 and BZ232. Each of these
gave high ELISA titres and also serum bactericidal titres of >8192. ΔG287K, expressed from
pET-24b, gave excellent titres in ELISA and the serum bactericidal assay.
ΔG287-ORF46.1K may also be expressed in pET-24b.

ΔG287 was also fused directly in-frame upstream of 919, 953, 961 (sequences shown below)
and ORF46.1:



ΔG287-919

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC 

51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG 

101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG 

151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA 

201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT 

251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA 

301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT 

351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA 

401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA 

451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC 

501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA 

551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC 

601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC 

651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA 

701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG 

751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA 

801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG 

851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA 

901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG 

951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG 

1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC 

1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT

1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA

1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC

1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGATGCCAAA GCAAGAGCAT

1251 CCAAACCTTT CCGCAACCCG ACACATCCGT CATCAACGGC CCGGACCGGC

1301 CGGTCGGCAT CCCCGACCCC GCCGGAACGA CGGTCGGCGG CGGCGGGGCC

1351 GTCTATACCG TTGTACCGCA CCTGTCCCTG CCCCACTGGG CGGCGCAGGA

1401 TTTCGCCAAA AGCCTGCAAT CCTTCCGCCT CGGCTGCGCC AATTTGAAAA

1451 ACCGCCAAGG CTGGCAGGAT GTGTGCGCCC AAGCCTTTCA AACCCCCGTC

1501 CATTCCTTTC AGGCAAAACA GTTTTTTGAA CGCTATTTCA CGCCGTGGCA

1551 GGTTGCAGGC AACGGAAGCC TTGCCGGTAC GGTTACCGGC TATTACGAGC

1601 CGGTGCTGAA GGGCGACGAC AGGCGGACGG CACAAGCCCG CTTCCCGATT

1651 TACGGTATTC CCGACGATTT TATCTCCGTC CCCCTGCCTG CCGGTTTGCG

1701 GAGCGGAAAA GCCCTTGTCC GCATCAGGCA GACGGGAAAA AACAGCGGCA

1751 CAATCGACAA TACCGGCGGC ACACATACCG CCGACCTCTC CCGATTCCCC

1801 ATCACCGCGC GCACAACGGC AATCAAAGGC AGGTTTGAAG GAAGCCGCTT

1851 CCTCCCCTAC CACACGCGCA ACCAAATCAA CGGCGGCGCG CTTGACGGCA

1901 AAGCCCCGAT ACTCGGTTAC GCCGAAGACC CCGTCGAACT TTTTTTTATG

1951 CACATCCAAG GCTCGGGCCG TCTGAAAACC CCGTCCGGCA AATACATCCG

2001 CATCGGCTAT GCCGACAAAA ACGAACATCC CTACGTTTCC ATCGGACGCT

2051 ATATGGCGGA CAAAGGCTAC CTCAAGCTCG GGCAGACCTC GATGCAGGGC

2101 ATCAAAGCCT ATATGCGGCA AAATCCGCAA CGCCTCGCCG AAGTTTTGGG

2151 TCAAAACCCC AGCTATATCT TTTTCCGCGA GCTTGCCGGA AGCAGCAATG

2201 ACGGTCCCGT CGGCGCACTG GGCACGCCGT TGATGGGGGA ATATGCCGGC

2251 GCAGTCGACC GGCACTACAT TACCTTGGGC GCGCCCTTAT TTGTCGCCAC

2301 CGCCCATCCG GTTACCCGCA AAGCCCTCAA CCGCCTGATT ATGGCGCAGG

2351 ATACCGGCAG CGCGATTAAA GGCGCGGTGC GCGTGGATTA TTTTTGGGGA

2401 TACGGCGACG AAGCCGGCGA ACTTGCCGGC AAACAGAAAA CCACGGGTTA

2451 CGTCTGGCAG CTCCTACCCA ACGGTATGAA GCCCGAATAC CGCCCGTAAC

2501 TCGAG



1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GCQSKSIQTF PQPDTSVING PDRPVGIPDP AGTTVGGGGA
451 VYTVVPHLSL PHWAAQDFAK SLQSFRLGCA NLKNRQGWQD VCAQAFQTPV
501 HSFQAKQFFE RYFTPWQVAG NGSLAGTVTG YYEPVLKGDD RRTAQARFPI
551 YGIPDDFISV PLPAGLRSGK ALVRIRQTGK NSGTIDNTGG THTADLSRFP
601 ITARTTAIKG RFEGSRFLPY HTRNQINGGA LDGKAPILGY AEDPVELFFM
651 HIQGSGRLKT PSGKYIRIGY ADKNEHPYVS IGRYMADKGY LKLGQTSMQG
701 IKAYMRQNPQ RLAEVLGQNP SYIFFRELAG SSNDGPVGAL GTPLMGEYAG
751 AVDRHYITLG APLFVATAHP VTRKALNRLI MAQDTGSAIK GAVRVDYFWG
801 YGDEAGELAG KQKTTGYVWQ LLPNGMKPEY RP*

ΔG287-953

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACCT ACAAAGTGGA
1251 CGAATATCAC GCCAACGCCC GTTTCGCCAT CGACCATTTC AACACCAGCA
1301 CCAACGTCGG CGGTTTTTAC GGTCTGACCG GTTCCGTCGA GTTCGACCAA
1351 GCAAAACGCG ACGGTAAAAT CGACATCACC ATCCCCGTTG CCAACCTGCA
1401 AAGCGGTTCG CAACACTTTA CCGACCACCT GAAATCAGCC GACATCTTCG
1451 ATGCCGCCCA ATATCCGGAC ATCCGCTTTG TTTCCACCAA ATTCAACTTC
1501 AACGGCAAAA AACTGGTTTC CGTTGACGGC AACCTGACCA TGCACGGCAA
1551 AACCGCCCCC GTCAAACTCA AAGCCGAAAA ATTCAACTGC TACCAAAGCC
1601 CGATGGCGAA AACCGAAGTT TGCGGCGGCG ACTTCAGCAC CACCATCGAC
1651 CGCACCAAAT GGGGCGTGGA CTACCTCGTT AACGTTGGTA TGACCAAAAG
1701 CGTCCGCATC GACATCCAAA TCGAGGCAGC CAAACAATAA CTCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GATYKVDEYH ANARFAIDHF NTSTNVGGFY GLTGSVEFDQ
451 AKRDGKIDIT IPVANLQSGS QHFTDHLKSA DIFDAAQYPD IRFVSTKFNF
501 NGKKLVSVDG NLTMHGKTAP VKLKAEKFNC YQSPMAKTEV CGGDFSTTID
551 RTKWGVDYLV NVGMTKSVRI DIQIEAAKQ*

ΔG287-961

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC

51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG

101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG

151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA

201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT

251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA

301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT

351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA

401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA

451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC

501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA

551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC

601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC

651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA

701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG

751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA

801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG

851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA

901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG

951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG

1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC

1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT

1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA

1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC

1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACAA ACGACGACGA

1251 TGTTAAAAAA GCTGCCACTG TGGCCATTGC TGCTGCCTAC AACAATGGCC

1301 AAGAAATCAA CGGTTTCAAA GCTGGAGAGA CCATCTACGA CATTGATGAA

1351 GACGGCACAA TTACCAAAAA AGACGCAACT GCAGCCGATG TTGAAGCCGA

1401 CGACTTTAAA GGTCTGGGTC TGAAAAAAGT CGTGACTAAC CTGACCAAAA

1451 CCGTCAATGA AAACAAACAA AACGTCGATG CCAAAGTAAA AGCTGCAGAA

1501 TCTGAAATAG AAAAGTTAAC AACCAAGTTA GCAGACACTG ATGCCGCTTT

1551 AGCAGATACT GATGCCGCTC TGGATGCAAC CACCAACGCC TTGAATAAAT

1601 TGGGAGAAAA TATAACGACA TTTGCTGAAG AGACTAAGAC AAATATCGTA

1651 AAAATTGATG AAAAATTAGA AGCCGTGGCT GATACCGTCG ACAAGCATGC 

1701 CGAAGCATTC AACGATATCG CCGATTCATT GGATGAAACC AACACTAAGG 

1751 CAGACGAAGC CGTCAAAACC GCCAATGAAG CCAAACAGAC GGCCGAAGAA 

1801 ACCAAACAAA ACGTCGATGC CAAAGTAAAA GCTGCAGAAA CTGCAGCAGG 

1851 CAAAGCCGAA GCTGCCGCTG GCACAGCTAA TACTGCAGCC GACAAGGCCG 

1901 AAGCTGTCGC TGCAAAAGTT ACCGACATCA AAGCTGATAT CGCTACGAAC 

1951 AAAGATAATA TTGCTAAAAA AGCAAACAGT GCCGACGTGT ACACCAGAGA 

2001 AGAGTCTGAC AGCAAATTTG TCAGAATTGA TGGTCTGAAC GCTACTACCG 

2051 AAAAATTGGA CACACGCTTG GCTTCTGCTG AAAAATCCAT TGCCGATCAC 

2101 GATACTCGCC TGAACGGTTT GGATAAAACA GTGTCAGACC TGCGCAAAGA 

2151 AACCCGCCAA GGCCTTGCAG AACAAGCCGC GCTCTCCGGT CTGTTCCAAC 

2201 CTTACAACGT GGGTCGGTTC AATGTAACGG CTGCAGTCGG CGGCTACAAA 

2251 TCCGAATCGG CAGTCGCCAT CGGTACCGGC TTCCGCTTTA CCGAAAACTT 

2301 TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC TTCGTCCGGT TCTTCCGCAG 

2351 CCTACCATGT CGGCGTCAAT TACGAGTGGT AACTCGAG 

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM

51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS

101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL

151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI 

201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG 

251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE 

301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI 

351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG 

401 KKEQDGSGGG GATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE

451 DGTITKKDAT AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE

501 SEIEKLTTKL ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV 

551 KIDEKLEAVA DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE 

601 TKQNVDAKVK AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN 

651 KDNIAKKANS ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH 

701 DTRLNGLDKT VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK 

751 SESAVAIGTG FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEW* 
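
The in-frame fusion can be checked by translating the nucleotide sequence across the junction and comparing it with the listed protein sequence. The following is an illustrative sketch: the function and variable names are hypothetical, the standard genetic code is assumed, and the 60 nucleotides used are positions 1201-1260 of the ΔG287-961 sequence above, which should correspond to residues 401-420 of the encoded protein.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1; spaces are ignored, trailing partial codons dropped."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1201-1260 of the hybrid (end of the 287 part, linker, start of 961).
JUNCTION_961 = "AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACAA ACGACGACGA TGTTAAAAAA"
print(translate(JUNCTION_961))  # KKEQDGSGGGGATNDDDVKK
```

The translated junction reads through the glycine-rich linker without a frame shift or stop codon, which is what an in-frame fusion requires.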





                 ELISA     Bactericidal
ΔG287-953-His    3834      65536
ΔG287-961-His    108627    65536



The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins 
was compared with antibodies raised against simple mixtures of the component antigens 
(using 287-GST) for 919 and ORF46.1: 





            Mixture with 287    Hybrid with ΔG287
919         32000               128000
ORF46.1     128                 16000



Data for bactericidal activity against heterologous MenB strains and against serogroups A and
C were also obtained:

                 919                   ORF46.1
Strain           Mixture   Hybrid      Mixture   Hybrid
[illegible]      1024      32000       -         16384
[illegible]      -         -           -         512
BZ232            512       512         -         -
MenA (F6124)     512       32000       -         8192
MenC (C11)       >2048     >2048       -         -
MenC (BZ133)     >4096     64000       -         8192

The hybrid proteins with ΔG287 at the N-terminus are therefore immunologically superior to
simple mixtures, with ΔG287-ORF46.1 being particularly effective, even against
heterologous strains. ΔG287-ORF46.1K may be expressed in pET-24b.

The same hybrid proteins were made using New Zealand strain 394/98 rather than 2996:



ΔG287NZ-919

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATCTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGATG CCAAAGCAAG AGCATCCAAA
1451 CCTTTCCGCA ACCCGACACA TCCGTCATCA ACGGCCCGGA CCGGCCGGTC
1501 GGCATCCCCG ACCCCGCCGG AACGACGGTC GGCGGCGGCG GGGCCGTCTA
1551 TACCGTTGTA CCGCACCTGT CCCTGCCCCA CTGGGCGGCG CAGGATTTCG
1601 CCAAAAGCCT GCAATCCTTC CGCCTCGGCT GCGCCAATTT GAAAAACCGC
1651 CAAGGCTGGC AGGATGTGTG CGCCCAAGCC TTTCAAACCC CCGTCCATTC
1701 CTTTCAGGCA AAACAGTTTT TTGAACGCTA TTTCACGCCG TGGCAGGTTG
1751 CAGGCAACGG AAGCCTTGCC GGTACGGTTA CCGGCTATTA CGAGCCGGTG
1801 CTGAAGGGCG ACGACAGGCG GACGGCACAA GCCCGCTTCC CGATTTACGG
1851 TATTCCCGAC GATTTTATCT CCGTCCCCCT GCCTGCCGGT TTGCGGAGCG
1901 GAAAAGCCCT TGTCCGCATC AGGCAGACGG GAAAAAACAG CGGCACAATC
1951 GACAATACCG GCGGCACACA TACCGCCGAC CTCTCCCGAT TCCCCATCAC
2001 CGCGCGCACA ACGGCAATCA AAGGCAGGTT TGAAGGAAGC CGCTTCCTCC
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10 



2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 



CCTACCACAC 
CCGATACTCG 
CCAAGGCTCG 
GCTATGCCGA 
GCGGACAAAG 
AGCCTATATG 
ACCCCAGCTA 
CCCGTCGGCG 
CGACCGGCAC 
ATCCGGTTAC 
GGCAGCGCGA 
CGACGAAGCC 
GGCAGCTCCT 

MASPDVKSAD 
AAVSEENTGN 
PAGNMENQAP 
TNQAENNQTA 
THCKGDSCSG 
GLVADSVQMK 
ADTLIVDGEA 
SKGEMLAGTA 
GDGLHMGTQK 
RPTDAEKGGF 
GIPDPAGTTV 
QGWQDVCAQA 
LKGDDRRTAQ 
DNTGGTHTAD 
PILGYAEDPV 
ADKGYLKLGQ 
PVGALGTPLM 
GSAIKGAVRV 



GCGCAACCAA 
GTTACGCCGA 
GGCCGTCTGA 
CAAAAACGAA 
GCTACCTCAA 
CGGCAAAATC 
TATCTTTTTC 
CACTGGGCAC 
TACATTACCT 
CCGCAAAGCC 
TTAAAGGCGC 
GGCGAACTTG 
ACCCAACGGT 

TLSKPAAPVV 
GGAAATDKPK 
DAGESEQPAN 
GSQNPASSTN 
NNFLDEEVQL 
GINQYIIFYK 
VSLTGHSGNI 
VYNGEVLHFH 
FKAAIDGNGF 
GVFAGKKEQD 
GGGGAVYTVV 
FQTPVHSFQA 
ARFPIYGIPD 
LSRFPITART 
ELFFMHIQGS 
TSMQGIKAYM 
GEYAGAVDRH 
DYFWGYGDEA 



ATCAACGGCG 
AGACCCCGTC 
AAACCCCGTC 
CATCCCTACG 
GCTCGGGCAG 
CGCAACGCCT 
CGCGAGCTTG 
GCCGTTGATG 
TGGGCGCGCC 
CTCAACCGCC 
GGTGCGCGTG 
CCGGCAAACA 
ATGAAGCCCG 

SEKETEAKED 
NEDEGAQNDM 
QPDMANTADG 
PSATNSGGDF 
KSEFEKLSDA 
PKPTSFARFR 
FAPEGNYRYL 
TENGRPSPSR 
KGTWTENGGG 
GSGGGGCQSK 
PHLSLPHWAA 
KQFFERYFTP 
DFISVPLPAG 
TAIKGRFEGS 
GRLKTPSGKY 
RQNPQRLAEV 
YITLGAPLFV 
GELAGKQKTT 



GCGCGCTTGA 
GAACTTTTTT 
CGGCAAATAC 
TTTCCATCGG 
ACCTCGATGC 
CGCCGAAGTT 
CCGGAAGCAG 
GGGGAATATG 
CTTATTTGTC 
TGATTATGGC 
GATTATTTTT 
GAAAACCACG 
AATACCGCCC 

APQAGSQGQG 
PQNAADTDSL 
MQGDDPSAGG 
GRTNVGNSVV 
DKISNYKKDG 
RSARSRRSLP 
TYGAEKLPGG 
GRFAAKVDFG 
DVSGKFYGPA 
SIQTFPQPDT 
QDFAKSLQSF 
WQVAGNGSLA 
LRSGKALVRI 
RFLPYHTRNQ 
IRIGYADKNE 
LGQNPSYIFF 
ATAHPVTRKA 
GYVWQLLPNG 



CGGCAAAGCC 
TTATGCACAT 
ATCCGCATCG 
ACGCTATATG 
AGGGCATCAA 
TTGGGTCAAA 
CAATGACGGT 
CCGGCGCAGT 
GCCACCGCCC 
GCAGGATACC 
GGGGATACGG 
GGTTACGTCT 
CTAAAAGCTT 

APSAQGGQDM 
TPNHTPASNM 
ENAGNTAAQG 
IDGPSQNITL 
KNDGKNDKFV 
AEMPLIPVNQ 
SYALRVQGEP 
SKSVDGIIDS 
GEEVAGKYSY 
SVINGPDRPV 
RLGCANLKNR 
GTVTGYYEPV 
RQTGKNSGTI 
INGGALDGKA 
HPYVSIGRYM 
RELAGSSNDG 
LNRLIMAQDT 
MKPEYRP* 
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AG287NZ-953 



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 



ATGGCTAGCC 
CCCTGTTGTT 
CAGGTTCTCA 
GCGGCGGTTT 
CAAACCCAAA 
CCGCCGATAC 
CCGGCCGGAA 
GCCGGCAAAC 
ACGATCCGTC 
ACAAATCAAG 
TTCAACCAAT 
ACGTGGGCAA 
ACCCACTGTA 
AGTACAGCTA 
GTAATTACAA 
GGTTTGGTTG 
CTTTTATAAA 
GGTCGAGGCG 
GCGGATACGC 
CGGCAATATC 
CGGAAAAATT 
TCAAAAGGCG 
GCATTTTCAT 
CCGCAAAAGT 
GGCGATGGTT 
AAACGGCTTT 
GAAAGTTTTA 
CGCCCAACAG 
AGAGCAGGAT 
ATCACGCCAA 
GTCGGCGGTT 
ACGCGACGGT 



CCGATGTCAA 
TCTGAAAAAG 
AGGACAGGGC 
CGGAAGAAAA 
AATGAAGACG 
AGATAGTTTG 
ATATGGAAAA 
CAACCGGATA 
GGCAGGCGGG 
CCGAAAACAA 
CCTAGCGCCA 
TTCTGTTGTG 
AAGGCGATTC 
AAATCAGAAT 
GAAAGATGGG 
CCGATAGTGT 
CCTAAACCCA 
GTCGCTTCCG 
TGATTGTCGA 
TTCGCGCCCG 
GCCCGGCGGA 
AAATGCTCGC 
ACGGAAAACG 
CGATTTCGGC 
TGCATATGGG 
AAGGGGACTT 
CGGCCCGGCC 
ATGCGGAAAA 
GGATCCGGAG 
CGCCCGTTTC 
TTTACGGTCT 
AAAATCGACA 



GTCGGCGGAC 
AGACAGAGGC 
GCGCCATCCG 
TACAGGCAAT 
AGGGGGCGCA 
ACACCGAATC 
CCAAGCACCG 
TGGCAAATAC 
GAAAATGCCG 
TCAAACCGCC 
CGAATAGCGG 
ATTGACGGGC 
TTGTAGTGGC 
TTGAAAAATT 
AAGAATGACG 
GCAGATGAAG 
CTTCATTTGC 
GCCGAGATGC 
TGGGGAAGCG 
AAGGGAATTA 
TCGTATGCCC 
GGGCACGGCA 
GCCGTCCGTC 
AGCAAATCTG 
TACGCAAAAA 
GGACGGAAAA 
GGCGAGGAAG 
GGGCGGATTC 
GAGGAGGAGC 
GCCATCGACC 
GACCGGTTCC 
TCACCATCCC 



ACGCTGTCAA 
AAAGGAAGAT 
CACAAGGCGG 
GGCGGTGCGG 
AAATGATATG 
ACACCCCGGC 
GATGCCGGGG 
GGCGGACGGA 
GCAATACGGC 
GGTTCTCAAA 
TGGTGATTTT 
CGTCGCAAAA 
AATAATTTCT 
AAGTGATGCA 
GGAAGAATGA 
GGAATCAATC 
GCGATTTAGG 
CGCTGATTCC 
GTCAGCCTGA 
CCGGTATCTG 
TCCGTGTTCA 
GTGTACAACG 
CCCGTCCAGA 
TGGACGGCAT 
TTCAAAGCCG 
TGGCGGCGGG 
TGGCGGGAAA 
GGCGTGTTTG 
CACCTACAAA 
ATTTCAACAC 
GTCGAGTTCG 
CGTTGCCAAC 



AACCTGCCGC 
GCGCCACAGG 
TCAAGATATG 
CAGCAACGGA 
CCGCAAAATG 
TTCGAATATG 
AATCGGAGCA 
ATGCAGGGTG 
TGCCCAAGGT 
ATCCTGCCTC 
GGAAGGACGA 
TATAACGTTG 
TGGATGAAGA 
GACAAAATAA 
TAAATTTGTC 
AATATATTAT 
CGTTCTGCAC 
CGTCAATCAG 
CGGGGCATTC 
ACTTACGGGG 
AGGCGAACCT 
GCGAAGTGCT 
GGCAGGTTTG 
TATCGACAGC 
CCATCGATGG 
GATGTTTCCG 
ATACAGCTAT 
CCGGCAAAAA 
GTGGACGAAT 
CAGCACCAAC 
ACCAAGCAAA 
CTGCAAAGCG 
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1601 
1651 
1701 
1751 
1801 
1851 
1901 



GTTCGCAACA 
GCCCAATATC 
CAAAAAACTG 
CCCCCGTCAA 
GCGAAAACCG 
CAAATGGGGC 
GCATCGACAT 



CTTTACCGAC 
CGGACATCCG 
GTTTCCGTTG 
ACTCAAAGCC 
AAGTTTGCGG 
GTGGACTACC 
CCAAATCGAG 



CACCTGAAAT 
CTTTGTTTCC 
ACGGCAACCT 
GAAAAATTCA 
CGGCGACTTC 
TCGTTAACGT 
GCAGCCAAAC 



CAGCCGACAT 
ACCAAATTCA 
GACCATGCAC 
ACTGCTACCA 
AGCACCACCA 
TGGTATGACC 
AATAAAAGCT 



CTTCGATGCC 
ACTTCAACGG 
GGCAAAACCG 
AAGCCCGATG 
TCGACCGCAC 
AAAAGCGTCC 
T 



1 

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 



MASPDVKSAD 
AAVSEENTGN 
PAGNMENQAP 
TNQAENNQTA 
THCKGDSCSG 
GLVADSVQMK 
ADTLIVDGEA 
SKGEMLAGTA 
GDGLHMGTQK 
RPTDAEKGGF 
VGGFYGLTGS 
AQYPDIRFVS 
AKTEVCGGDF 



TLSKPAAPVV 
GGAAATDKPK 
DAGESEQPAN 
GSQNPASSTN 
NNFLDEEVQL 
GINQYIIFYK 
VSLTGHSGNI 
VYNGEVLHFH 
FKAAIDGNGF 
GVFAGKKEQD 
VEFDQAKRDG 
TKFNFNGKKL 
STTIDRTKWG 



SEKETEAKED 
NEDEGAQNDM 
QPDMANTADG 
PSATNSGGDF 
KSEFEKLSDA 
PKPTSFARFR 
FAPEGNYRYL 
TENGRPSPSR 
KGTWTENGGG 
GSGGGGATYK 
KIDITIPVAN 
VSVDGNLTMH 
VDYLVNVGMT 



APQAGSQGQG 
PQNAADTDSL 
MQGDDPSAGG 
GRTNVGNSVV 
DKISNYKKDG 
RSARSRRSLP 
TYGAEKLPGG 
GRFAAKVDFG 
DVSGKFYGPA 
VDEYHANARF 
LQSGSQHFTD 
GKTAPVKLKA 
KSVRIDIQIE 



APSAQGGQDM 
TPNHTPASNM 
ENAGNTAAQG 
IDGPSQNITL 
KNDGKNDKFV 
AEMPLIPVNQ 
SYALRVQGEP 
SKSVDGIIDS 
GEEVAGKYSY 
AIDHFNTSTN 
HLKSADIFDA 
EKFNCYQSPM 
AAKQ* 
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AG287NZ-961 



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 



ATGGCTAGCC 
CCCTGTTGTT 
CAGGTTCTCA 
GCGGCGGTTT 
CAAACCCAAA 
CCGCCGATAC 
CCGGCCGGAA 
GCCGGCAAAC 
ACGATCCGTC 
ACAAATCAAG 
TTCAACCAAT 
ACGTGGGCAA 
ACCCACTGTA 
AGTACAGCTA 
GTAATTACAA 
GGTTTGGTTG 
CTTTTATAAA 
GGTCGAGGCG 
GCGGATACGC 
CGGCAATATC 
CGGAAAAATT 
TCAAAAGGCG 
GCATTTTCAT 
CCGCAAAAGT 
GGCGATGGTT 
AAACGGCTTT 
GAAAGTTTTA 
CGCCCAACAG 
AGAGCAGGAT 
AAAAAGCTGC 
ATCAACGGTT 
CACAATTACC 
TTAAAGGTCT 
AATGAAAACA 
AATAGAAAAG 
ATACTGATGC 
GAAAATATAA 
TGATGAAAAA 
CATTCAACGA 
GAAGCCGTCA 
ACAAAACGTC 
CCGAAGCTGC 
GTCGCTGCAA 



CCGATGTCAA 
TCTGAAAAAG 
AGGACAGGGC 
CGGAAGAAAA 
AATGAAGACG 
AGATAGTTTG 
ATATGGAAAA 
CAACCGGATA 
GGCAGGCGGG 
CCGAAAACAA 
CCTAGCGCCA 
TTCTGTTGTG 
AAGGCGATTC 
AAATCAGAAT 
GAAAGATGGG 
CCGATAGTGT 
CCTAAACCCA 
GTCGCTTCCG 
TGATTGTCGA 
TTCGCGCCCG 
GCCCGGCGGA 
AAATGCTCGC 
ACGGAAAACG 
CGATTTCGGC 
TGCATATGGG 
AAGGGGACTT 
CGGCCCGGCC 
ATGCGGAAAA 
GGATCCGGAG 
CACTGTGGCC 
TCAAAGCTGG 
AAAAAAGACG 
GGGTCTGAAA 
AACAAAACGT 
TTAACAACCA 
CGCTCTGGAT 
CGACATTTGC 
TTAGAAGCCG 
TATCGCCGAT 
AAACCGCCAA 
GATGCCAAAG 
CGCTGGCACA 
AAGTTACCGA 



GTCGGCGGAC 
AGACAGAGGC 
GCGCCATCCG 
TACAGGCAAT 
AGGGGGCGCA 
ACACCGAATC 
CCAAGCACCG 
TGGCAAATAC 
GAAAATGCCG 
TCAAACCGCC 
CGAATAGCGG 
ATTGACGGGC 
TTGTAGTGGC 
TTGAAAAATT 
AAGAATGACG 
GCAGATGAAG 
CTTCATTTGC 
GCCGAGATGC 
TGGGGAAGCG 
AAGGGAATTA 
TCGTATGCCC 
GGGCACGGCA 
GCCGTCCGTC 
AGCAAATCTG 
TACGCAAAAA 
GGACGGAAAA 
GGCGAGGAAG 
GGGCGGATTC 
GAGGAGGAGC 
ATTGCTGCTG 
AGAGACCATC 
CAACTGCAGC 
AAAGTCGTGA 
CGATGCCAAA 
AGTTAGCAGA 
GCAACCACCA 
TGAAGAGACT 
TGGCTGATAC 
TCATTGGATG 
TGAAGCCAAA 
TAAAAGCTGC 
GCTAATACTG 
CATCAAAGCT 



ACGCTGTCAA 
AAAGGAAGAT 
CACAAGGCGG 
GGCGGTGCGG 
AAATGATATG 
ACACCCCGGC 
GATGCCGGGG 
GGCGGACGGA 
GCAATACGGC 
GGTTCTCAAA 
TGGTGATTTT 
CGTCGCAAAA 
AATAATTTCT 
AAGTGATGCA 
GGAAGAATGA 
GGAATCAATC 
GCGATTTAGG 
CGCTGATTCC 
GTCAGCCTGA 
CCGGTATCTG 
TCCGTGTTCA 
GTGTACAACG 
CCCGTCCAGA 
TGGACGGCAT 
TTCAAAGCCG 
TGGCGGCGGG 
TGGCGGGAAA 
GGCGTGTTTG 
CACAAACGAC 
CCTACAACAA 
TACGACATTG 
CGATGTTGAA 
CTAACCTGAC 
GTAAAAGCTG 
CACTGATGCC 
ACGCCTTGAA 
AAGACAAATA 
CGTCGACAAG 
AAACCAACAC 
CAGACGGCCG 
AGAAACTGCA 
CAGCCGACAA 
GATATCGCTA 



AACCTGCCGC 
GCGCCACAGG 
TCAAGATATG 
CAGCAACGGA 
CCGCAAAATG 
TTCGAATATG 
AATCGGAGCA 
ATGCAGGGTG 
TGCCCAAGGT 
ATCCTGCCTC 
GGAAGGACGA 
TATAACGTTG 
TGGATGAAGA 
GACAAAATAA 
TAAATTTGTC 
AATATATTAT 
CGTTCTGCAC 
CGTCAATCAG 
CGGGGCATTC 
ACTTACGGGG 
AGGCGAACCT 
GCGAAGTGCT 
GGCAGGTTTG 
TATCGACAGC 
CCATCGATGG 
GATGTTTCCG 
ATACAGCTAT 
CCGGCAAAAA 
GACGATGTTA 
TGGCCAAGAA 
ATGAAGACGG 
GCCGACGACT 
CAAAACCGTC 
CAGAATCTGA 
GCTTTAGCAG 
TAAATTGGGA 
TCGTAAAAAT 
CATGCCGAAG 
TAAGGCAGAC 
AAGAAACCAA 
GCAGGCAAAG 
GGCCGAAGCT 
CGAACAAAGA 
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2251 
2301 
2351 
2401 
2451 
2501 
2551 

1 
51 
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651 
701 
751 
801 
851 



TAATATTGCT 
CTGACAGCAA 
TTGGACACAC 
TCGCCTGAAC 
GCCAAGGCCT 
AACGTGGGTC 
ATCGGCAGTC 
CCAAAGCAGG 
CATGTCGGCG 

MASPDVKSAD 
AAVSEENTGN 
PAGNMENQAP 
TNQAENNQTA 
THCKGDSCSG 
GLVADSVQMK 
ADTLIVDGEA 
SKGEMLAGTA 
GDGLHMGTQK 
RPTDAEKGGF 
INGFKAGETI 
NENKQNVDAK 
ENITTFAEET 
EAVKTANEAK 
VAAKVTDIKA 
LDTRLASAEK 
NVGRFNVTAA 
HVGVNYEW* 



AAAAAAGCAA 
ATTTGTCAGA 
GCTTGGCTTC 
GGTTTGGATA 
TGCAGAACAA 
GGTTCAATGT 
GCCATCGGTA 
CGTGGCAGTC 
TCAATTACGA 

TLSKPAAPVV 
GGAAATDKPK 
DAGESEQPAN 
GSQNPASSTN 
NNFLDEEVQL 
GINQYIIFYK 
VSLTGHSGNI 
VYNGEVLHFH 
FKAAIDGNGF 
GVFAGKKEQD 
YDIDEDGTIT 
VKAAESEIEK 
KTNIVKIDEK 
QTAEETKQNV 
DIATNKDNIA 
SIADHDTRLN 
VGGYKSESAV 



ACAGTGCCGA 
ATTGATGGTC 
TGCTGAAAAA 
AAACAGTGTC 
GCCGCGCTCT 
AACGGCTGCA 
CCGGCTTCCG 
GGCACTTCGT 
GTGGTAAAAG 

SEKETEAKED 
NEDEGAQNDM 
QPDMANTADG 
PSATNSGGDF 
KSEFEKLSDA 
PKPTSFARFR 
FAPEGNYRYL 
TENGRPSPSR 
KGTWTENGGG 
GSGGGGATND 
KKDATAADVE 
LTTKLADTDA 
LEAVADTVDK 
DAKVKAAETA 
KKANSADVYT 
GLDKTVSDLR 
AIGTGFRFTE 



CGTGTACACC 
TGAACGCTAC 
TCCATTGCCG 
AGACCTGCGC 
CCGGTCTGTT 
GTCGGCGGCT 
CTTTACCGAA 
CCGGTTCTTC 
CTT 

APQAGSQGQG 
PQNAADTDSL 
MQGDDPSAGG 
GRTNVGNSVV 
DKISNYKKDG 
RSARSRRSLP 
TYGAEKLPGG 
GRFAAKVDFG 
DVSGKFYGPA 
DDVKKAATVA 
ADDFKGLGLK 
ALADTDAALD 
HAEAFNDIAD 
AGKAEAAAGT 
REESDSKFVR 
KETRQGLAEQ 
NFAAKAGVAV 



AGAGAAGAGT 
TACCGAAAAA 
ATCACGATAC 
AAAGAAACCC 
CCAACCTTAC 
ACAAATCCGA 
AACTTTGCCG 
CGCAGCCTAC 



APSAQGGQDM 
TPNHTPASNM 
ENAGNTAAQG 
IDGPSQNITL 
KNDGKNDKFV 
AEMPLIPVNQ 
SYALRVQGEP 
SKSVDGIIDS 
GEEVAGKYSY 
IAAAYNNGQE 
KVVTNLTKTV 
ATTNALNKLG 
SLDETNTKAD 
ANTAADKAEA 
IDGLNATTEK 
AALSGLFQPY 
GTSSGSSAAY 



ΔG983 and hybrids 

Bactericidal titres generated in response to ΔG983 (His-fusion) were measured against 
various strains, including the homologous 2996 strain: 

            2996    NGH38   BZ133 
ΔG983       512     128     128 

ΔG983 was also expressed as a hybrid, with ORF46.1, 741, 961 or 961c at its C-terminus: 

ΔG983-ORF46.1 



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 



ATGACTTCTG 
CAGCAGAGCA 
AGAACGAAAT 
GTTGCGGTTA 
GCATACCGGA 
ACCTCAAACC 
GGTATCGTCG 
GTATGGCAGA 
CGTATATGCG 
GCTTCTTTCG 
TATCCGCCAC 
TTGGCGGGCG 
GCGACGCTAC 
GGTTGCAGCC 
GCATCGTCAA 
CTTTTCCAAA 
CTATTCCGGC 
GCGATTACGG 
ATCTTTTCGA 
ATTGCCATTT 
GCGTAGACCG 
GGTACAGAAC 
GTGGTGCCTG 



CGCCCGACTT 
ACAACAGCGA 
GTGCAAAGAC 
CAGACAGGGA 
GACTTTCCAA 
TGCAATTGAA 
ACACAGGCGA 
AAAGAACACG 
GAAGGAAGCG 
ACGATGAGGC 
GTAAAAGAAA 
TTCCGTGGAC 
ACATAATGAA 
ATCCGCAATG 
TAACAGTTTT 
TAGCCAATTC 
GGTGATAAAA 
CAACCTGTCC 
CAGGCAATGA 
TATGAAAAAG 
CAGTGGAGAA 
CGCTTGAGTA 
TCGGCACCCT 



CAATGCAGGC 
AATCAGCAGC 
AGAAGCATGC 
TGCCAAAATC 
ACCCAAATGA 
GCAGGCTATA 
ATCCGTCGGC 
GCTATAACGA 
CCTGAAGACG 
CGTTATAGAG 
TCGGACACAT 
GGCAGACCTG 
TACGAATGAT 
CATGGGTCAA 
GGAACAACAT 
GGAGGAGCAG 
CAGACGAGGG 
TACCACATCC 
CGCACAAGCT 
ACGCTCAAAA 
AAGTTCAAAC 
TGGCTCCAAC 
ATGAAGCAAG 



GGTACCGGTA 
AGTATCTTAC 
TCTGTGCCGG 
AATGCCCCCC 
CGCATACAAG 
CAGGACGCGG 
AGCATATCCT 
AAATTACAAA 
GAGGCGGTAA 
ACTGAAGCAA 
CGATTTGGTC 
CAGGCGGTAT 
GAAACCAAGA 
GCTGGGCGAA 
CGAGGGCAGG 
TACCGCCAAG 
TATCCGCCTG 
GTAATAAAAA 
CAGCCCAACA 
AGGCATTATC 
GGGAAATGTA 
CATTGCGGAA 
CGTCCGTTTC 



TCGGCAGCAA 
GCCGGTATCA 
TCGGGATGAC 
CCCCGAATCT 
AATTTGATCA 
GGTAGAGGTA 
TTCCCGAACT 
AACTATACGG 
AGACATTGAA 
AGCCGACGGA 
TCCCATATTA 
TGCGCCCGAT 
ACGAAATGAT 
CGTGGCGTGC 
CACTGCCGAC 
CGTTGCTCGA 
ATGCAACAGA 
CATGCTTTTC 
CATATGCCCT 
ACAGTCGCAG 
TGGAGAACCG 
TTACTGCCAT 
ACCCGTACAA 
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1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 
2901 
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 
3951 
4001 
4051 
4101 
4151 
4201 
4251 
4301 
4351 
4401 



ACCCGATTCA 
ACGGCGGCTC 
GCGTACCACG 
ACAGCAAGTT 
CCCGCGTCCT 
CGATATTGCC 
TCAAAAAAGG 
GGCAAAACCA 
ATCGGATATG 
CATCCGGCGG 
GACCAATCCG 
GGACGGCAAA 
ACGGTACGGC 
GGGGCAGGCT 
CGCCAAAATC 
GCGGCCTGCT 
GGCGACACGC 
TTCGGCAGCG 
AGGGCGGCAG 
TCATCCGCAA 
TATGCCGGGC 
TACAGCATGC 
GCTACCGTCT 
CCGCCTGAAA 
GCGTCATCGC 
GTTGAAGGCA 
AACCGGCGAA 
CATGGAGCGA 
GCAGGCATAC 
CTCCTACGGA 
AACATGCGGA 
GGCGGTGTCA 
CGGTCTGCGC 
GTGCTTTGGG 
CTCGCGGGTC 
TGCAACGGCG 
CGGGCGGCTT 
AATATGCCGC 
CGGCAACGGC 
AGTACGGCAA 
GGTGGCGGAG 
GCAGGTTCTC 
TCGGCAGCAG 
AAAATACAAA 
TAAAGGAAAT 
TCCATTCCCC 
GGTAGTCCCG 
CGAACACCAT 
CCGCTCCCAA 
GCCCAAAATA 
GCTTGCCGAC 
GCGACGGATT 
GGCAATGCCG 
CATCGGCGCG 
TAAGCGAAGG 
ACCGAAAACA 
CAAAGACTAT 
ATGCCGCACA 
CCCATCAAAG 
CACGGCACAT 
AAGGGAAATC 
TACCCGTCCC 
TTACGGCAAA 
AAAATGTCAA 
GACGGTAAAG 
CGAGCACCAC 



AATTGCCGGA 
TGCTGCTGCA 
TTGCTGACGA 
CGGCTGGGGA 
TTCCGTTCGG 
TACTCCTTCC 
CGGCAGCCAA 
TTATCGAAGG 
CGCGTCGAAA 
CAGCCTGAAC 
GCGCAAACGA 
GGTACGCTGT 
GATTATCGGC 
ATCTCAACAG 
GGGCAGGATT 
GGCTTCCCTC 
TGTCCTATTA 
GCACATTCCG 
CAATCTGGAA 
CACCCGAGAC 
ATCCGCCCCT 
GAATGCCGCC 
ATGCCGACAG 
GCCGTATCGG 
GCAAACCCAA 
AAATGCGCGG 
AATACGACAG 
AAACAGTGCA 
GGCACGATGC 
CGCTACAAAA 
AGGCAGCGTC 
ACGTTCCGTT 
TACGACCTGC 
CTGGAGCGGC 
TGAAGCTGTC 
GGCGTGGAAC 
TACCGGCGCG 
ACACCCGTCT 
TGGAACGGCT 
CCACAGCGGA 
GCACTGGATC 
GACCGTCAGC 
GGGGGAACTT 
GCCATCAGTT 
ATCGGCTACA 
CTTCGACAAC 
TTGACGGATT 
CCCGCCGACG 
AGGCGCGAGG 
TCCGCCTCAA 
CGTTTCCACA 
CAAACGCGCC 
CCGAAGCCTT 
GCAGGAGAAA 
CTCAAACATT 
AGATGGCGCG 
GCCGCAGCAG 
AGGCATAGAA 
GGATTGGAGC 
CCTATCAAGC 
CGCCGTCAGC 
CTTACCATTC 
GAAAACATCA 
ACTGGCAGAC 
GGTTTCCGAA 
CACCACCACC 



ACATCCTTTT 
GAAATACCCG 
CGGCTCAGGA 
CTGCTGGATG 
CGACTTTACC 
GTAACGACAT 
CTGCAACTGC 
CGGTTCGCTG 
CCAAAGGTGC 
AGCGACGGCA 
AACCGTACAC 
ACACACGTTT 
GGCAAGCTGT 
TACCGGACGA 
ATTCTTTCTT 
GACAGCGTCG 
TGTCCGTCGC 

CGCCCGCCGG 
AACCTGATGG 
GGTTGAAACT 
ACGGCGCAAC 
GACGGTGTAC 
TACCGCCGCC 
ACGGGTTGGA 
CAGGACGGTG 
CAGTACCCAA 
CAGCCGCCAC 
AATGCAAAAA 
GGGCGATATC 
ACAGCATCAG 
AACGGCACGC 
TGCCGCAACG 
TCAAACAGGA 
AACAGCCTCA 
GCAACCCTTG 
GCGACCTGAA 
ACTGCAGCAA 
GGTTGCCGGC 
TGGCACGTTA 
CGAGTCGGCG 
CTCAGATTTG 
ATTTCGAACC 
GCCGAGCGCA 
GGGCAACCTG 
TTGTCCGCTT 
CATGCCTCAC 
TAGCCTTTAC 
GCTATGACGG 
GATATATACA 
CCTGACCGAC 
ATGCCGGTAG 
ACCCGATACA 
CAACGGCACT 
TTGTCGGCGC 
GCTGTCATGC 
CATCAACGAT 
CCATCCGCGA 
GCCGTCAGCA 
TGTTCGGGGA 
GGTCGCAGAT 
GACAATTTTG 
CCGAAATATC 
CCTCCTCAAC 
CAACGCCACC 
TTTTGAGAAG 
ACTGA 



CCGCACCCAT 
TGGATGAGCA 
CATCGGTGCA 
CGGGTAAGGC 
GCCGATACGA 
TTCAGGCACG 
ACGGCAACAA 
GTGTTGTACG 
GCTGATTTAT 
TTGTCTATCT 
ATCAAAGGCA 
GGGCAAACTG 
ACATGTCGGC 
CGTGTTCCCT 
CACAAACATC 
AAAAAACAGC 
GGCAATGCGG 
TCTGAAACAC 
TCGAACTGGA 
GCGGCAGCCG 
TTTCCGCGCA 
GCATCTTCAA 
CATGCCGATA 
CCACAACGGC 
GAACGTGGGA 
ACCGTCGGCA 
ACTGGGCATG 
CCGACAGCAT 
GGCTATCTCA 
CCGCAGCACC 
TGATGCAGCT 
GGAGATTTGA 
TGCATTCGCC 
CTGAAGGCAC 
AGCGATAAAG 
CGGACGCGAC 
CCGGCAAGAC 
CTGGGCGCGG 
CAGCTACGCC 
TAGGCTACCG 
GCAAACGATT 
CGACGGGAAA 
GCGGCCATAT 
ATGATTCAAC 
TTCCGATCAC 
ATTCCGATTC 
CGCATCCATT 
GCCACAGGGC 
GCTACGACAT 
AACCGCAGCA 
TATGCTGACG 
GCCCCGAGCT 
GCAGATATCG 
AGGCGATGCC 
ACGGCTTGGG 
TTGGCAGATA 
TTGGGCAGTC 
ATATCTTTAT 
AAATACGGCT 
GGGCGCGATC 
CCGATGCGGC 
CGTTCAAACT 
CGTGCCGCCG 
CGAAGACAGG 
CACGTGAAAT 



CGTAACCGGC 
ACGACAACCT 
GTCGGCGTGG 
CATGAACGGA 
AAGGTACATC 
GGCGGCCTGA 
CACCTATACG 
GCAACAACAA 
AACGGGGCGG 
GGCAGATACC 
GTCTGCAGCT 
CTGAAAGTGG 
ACGCGGCAAG 
TCCTGAGTGC 
GAAACCGACG 
GGGCAGTGAA 
CACGGACTGC 
GCCGTAGAAC 
TGCCTCCGAA 
ACCGCACAGA 
GCGGCAGCCG 
CAGTCTCGCC 
TGCAGGGACG 
ACGGGTCTGC 
ACAGGGCGGT 
TTGCCGCGAA 
GGACGCAGCA 
TAGTCTGTTT 
AAGGCCTGTT 
GGTGCGGACG 
GGGCGCACTG 
CGGTCGAAGG 
GAAAAAGGCA 
GCTGGTCGGA 
CCGTCCTGTT 
TACACGGTAA 
GGGGGCACGC 
ATGTCGAATT 
GGTTCCAAAC 
GTTCCTCGAC 
CTTTTATCCG 
TACCACCTAT 
CGGATTGGGA 
AGGCGGCCAT 
GGGCACGAAG 
TGATGAAGCC 
GGGACGGATA 
GGCGGCTATC 
AAAAGGCGTT 
CCGGACAACG 
CAAGGAGTAG 
GGACAGATCG 
TTAAAAACAT 
GTGCAGGGCA 
TCTGCTTTCC 
TGGCGCAACT 
CAAAACCCCA 
GGCAGCCATC 
TGGGCGGCAT 
GCATTGCCGA 
ATACGCCAAA 
TGGAGCAGCG 
TCAAACGGCA 
CGTACCGTTT 
ATGATACGCT 
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MTSAPDFNAG 
VAVTDRDAKI 
GIVDTGESVG 
ASFDDEAVIE 
ATLHIMNTND 
LFQIANSEEQ 
IFSTGNDAQA 
GTEPLEYGSN 
TAALLLQKYP 
PASFPFGDFT 
GKTIIEGGSL 
DQSGANETVH 
GAGYLNSTGR 
GDTLSYYVRR 
SSATPETVET 
ATVYADSTAA 
VEGKMRGSTQ 
AGIRHDAGDI 
GGVNVPFAAT 
LAGLKLSQPL 
NMPHTRLVAG 
GGGGTGSSDL 
KIQSHQLGNL 
GSPVDGFSLY 
AQNIRLNLTD 
GNAAEAFNGT 
TENKMARIND 
PIKGIGAVRG 
YPSPYHSRNI 
DGKGFPNFEK 



GTGIGSNSRA 
NAPPPNLHTG 
SISFPELYGR 
TEAKPTDIRH 
ETKNEMMVAA 
YRQALLDYSG 
QPNTYALLPF 
HCGITAMWCL 
WMSNDNLRTT 
ADTKGTSDIA 
VLYGNNKSDM 
IKGSLQLDGK 
RVPFLSAAKI 
GNAARTASAA 
AAADRTDMPG 
HADMQGRRLK 
TVGIAAKTGE 
GYLKGLFSYG 
GDLTVEGGLR 
SDKAVLFATA 
LGADVEFGNG 
ANDSFIRQVL 
MIQQAAIKGN 
RIHWDGYEHH 
NRSTGQRLAD 
ADIVKNIIGA 
LADMAQLKDY 
KYGLGGITAH 
RSNLEQRYGK 
HVKYDTLEHH 



TTAKSAAVSY 
DFPNPNDAYK 
KEHGYNENYK 
VKEIGHIDLV 
IRNAWVKLGE 
GDKTDEGIRL 
YEKDAQKGII 
SAPYEASVRF 
LLTTAQDIGA 
YSFRNDISGT 
RVETKGALIY 
GTLYTRLGKL 
GQDYSFFTNI 
AHSAPAGLKH 
IRPYGATFRA 
AVSDGLDHNG 
NTTAAATLGM 
RYKNSISRST 
YDLLKQDAFA 
GVERDLNGRD 
WNGLARYSYA 
DRQHFEPDGK 
IGYIVRFSDH 
PADGYDGPQG 
RFHNAGSMLT 
AGEIVGAGDA 
AAAAIRDWAV 
PIKRSQMGAI 
ENITSSTVPP 
HHHH* 



AGIKNEMCKD 
NLINLKPAIE 
NYTAYMRKEA 
SHIIGGRSVD 
RGVRIVNNSF 
MQQSDYGNLS 
TVAGVDRSGE 
TRTNPIQIAG 
VGVDSKFGWG 
GGLIKKGGSQ 
NGAASGGSLN 
LKVDGTAIIG 
ETDGGLLASL 
AVEQGGSNLE 
AAAVQHANAA 
TGLRVIAQTQ 
GRSTWSENSA 
GADEHAEGSV 
EKGSALGWSG 
YTVTGGFTGA 
GSKQYGNHSG 
YHLFGSRGEL 
GHEVHSPFDN 
GGYPAPKGAR 
QGVGDGFKRA 
VQGISEGSNI 
QNPNAAQGIE 
ALPKGKSAVS 
SNGKNVKLAD 



RSMLCAGRDD 
AGYTGRGVEV 
PEDGGGKDIE 
GRPAGGIAPD 
GTTSRAGTAD 
YHIRNKNMLF 
KFKREMYGEP 
TSFSAPIVTG 
LLDAGKAMNG 
LQLHGNNTYT 
SDGIVYLADT 
GKLYMSARGK 
DSVEKTAGSE 
NLMVELDASE 
DGVRIFNSLA 
QDGGTWEQGG 
NAKTDSISLF 
NGTLMQLGAL 
NSLTEGTLVG 
TAATGKTGAR 
RVGVGYRFLD 
AERSGHIGLG 
HASHSDSDEA 
DIYSYDIKGV 
TRYSPELDRS 
AVMHGLGLLS 
AVSNIFMAAI 
DNFADAAYAK 
QRHPKTGVPF 
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i 

51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 



ATGACTTCTG 
CAGCAGAGCA 
AGAACGAAAT 
GTTGCGGTTA 
GCATACCGGA 
ACCTCAAACC 
GGTATCGTCG 
GTATGGCAGA 
CGTATATGCG 
GCTTCTTTCG 
TATCCGCCAC 
TTGGCGGGCG 
GCGACGCTAC 
GGTTGCAGCC 
GCATCGTCAA 
CTTTTCCAAA 
CTATTCCGGC 
GCGATTACGG 
ATCTTTTCGA 
ATTGCCATTT 
GCGTAGACCG 
GGTACAGAAC 
GTGGTGCCTG 
ACCCGATTCA 
ACGGCGGCTC 
GCGTACCACG 
ACAGCAAGTT 
CCCGCGTCCT 
CGATATTGCC 
TCAAAAAAGG 
GGCAAAACCA 
ATCGGATATG 
CATCCGGCGG 
GACCAATCCG 



CGCCCGACTT 
ACAACAGCGA 
GTGCAAAGAC 
CAGACAGGGA 
GACTTTCCAA 
TGCAATTGAA 
ACACAGGCGA 
AAAGAACACG 
GAAGGAAGCG 
ACGATGAGGC 
GTAAAAGAAA 
TTCCGTGGAC 
ACATAATGAA 
ATCCGCAATG 
TAACAGTTTT 
TAGCCAATTC 
GGTGATAAAA 
CAACCTGTCC 
CAGGCAATGA 
TATGAAAAAG 
CAGTGGAGAA 
CGCTTGAGTA 
TCGGCACCCT 
AATTGCCGGA 
TGCTGCTGCA 
TTGCTGACGA 
CGGCTGGGGA 
TTCCGTTCGG 
TACTCCTTCC 
CGGCAGCCAA 
TTATCGAAGG 
CGCGTCGAAA 
CAGCCTGAAC 
GCGCAAACGA 



CAATGCAGGC 
AATCAGCAGC 
AGAAGCATGC 
TGCCAAAATC 
ACCCAAATGA 
GCAGGCTATA 
ATCCGTCGGC 
GCTATAACGA 
CCTGAAGACG 
CGTTATAGAG 
TCGGACACAT 
GGCAGACCTG 
TACGAATGAT 
CATGGGTCAA 
GGAACAACAT 
GGAGGAGCAG 
CAGACGAGGG 
TACCACATCC 
CGCACAAGCT 
ACGCTCAAAA 
AAGTTCAAAC 
TGGCTCCAAC 
ATGAAGCAAG 
ACATCCTTTT 
GAAATACCCG 
CGGCTCAGGA 
CTGCTGGATG 
CGACTTTACC 
GTAACGACAT 
CTGCAACTGC 
CGGTTCGCTG 
CCAAAGGTGC 
AGCGACGGCA 
AACCGTACAC 



GGTACCGGTA 
AGTATCTTAC 
TCTGTGCCGG 
AATGCCCCCC 
CGCATACAAG 
CAGGACGCGG 
AGCATATCCT 
AAATTACAAA 
GAGGCGGTAA 
ACTGAAGCAA 
CGATTTGGTC 
CAGGCGGTAT 
GAAACCAAGA 
GCTGGGCGAA 
CGAGGGCAGG 
TACCGCCAAG 
TATCCGCCTG 
GTAATAAAAA 
CAGCCCAACA 
AGGCATTATC 
GGGAAATGTA 
CATTGCGGAA 
CGTCCGTTTC 
CCGCACCCAT 
TGGATGAGCA 
CATCGGTGCA 
CGGGTAAGGC 
GCCGATACGA 
TTCAGGCACG 
ACGGCAACAA 
GTGTTGTACG 
GCTGATTTAT 
TTGTCTATCT 
ATCAAAGGCA 



TCGGCAGCAA 
GCCGGTATCA 
TCGGGATGAC 
CCCCGAATCT 
AATTTGATCA 
GGTAGAGGTA 
TTCCCGAACT 
AACTATACGG 
AGACATTGAA 
AGCCGACGGA 
TCCCATATTA 
TGCGCCCGAT 
ACGAAATGAT 
CGTGGCGTGC 
CACTGCCGAC 
CGTTGCTCGA 
ATGCAACAGA 
CATGCTTTTC 
CATATGCCCT 
ACAGTCGCAG 
TGGAGAACCG 
TTACTGCCAT 
ACCCGTACAA 
CGTAACCGGC 
ACGACAACCT 
GTCGGCGTGG 
CATGAACGGA 
AAGGTACATC 
GGCGGCCTGA 
CACCTATACG 
GCAACAACAA 
AACGGGGCGG 
GGCAGATACC 
GTCTGCAGCT 
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1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG 

1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG 

1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC 

1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG 

1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA 

1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC 

2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC 

2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA 

2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA 

2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG 

2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC 

2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG 

2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC 

2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT 

2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA 

2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA 

2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT 

2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT 

2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG 

2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG 

2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG 

2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA 

2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA 

2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT 

2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA 

2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC 

3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT 

3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC 

3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG 

3151 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA 

3201 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA 

3251 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA 

3301 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT 

3351 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG 

3401 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA 

3451 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA 

3501 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG 

3551 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA 

3601 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA 

3651 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA 

3701 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT 

3751 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA 

3801 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG 

3851 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT 

3901 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA 

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD 

51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV 

101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE 

151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD 

201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD 

251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF 

301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP 

351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG 

401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG 

451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT 

501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT 

551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK 

601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE 

651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE 

701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA 

751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG 

801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF 

851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL 

901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG 

951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR 

1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE 
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1051 
GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ 

1101 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ 

1151 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT 

1201 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD 

1251 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL 

1301 AAKQLEHHHH HH* 
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1 
51 
101 
151 
201 
251 
301 
351 
401 
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501 
551 
601 
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751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
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1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 



ATGACTTCTG 
CAGCAGAGCA 
AGAACGAAAT 
GTTGCGGTTA 
GCATACCGGA 
ACCTCAAACC 
GGTATCGTCG 
GTATGGCAGA 
CGTATATGCG 
GCTTCTTTCG 
TATCCGCCAC 
TTGGCGGGCG 
GCGACGCTAC 
GGTTGCAGCC 
GCATCGTCAA 
CTTTTCCAAA 
CTATTCCGGC 
GCGATTACGG 
ATCTTTTCGA 
ATTGCCATTT 
GCGTAGACCG 
GGTACAGAAC 
GTGGTGCCTG 
ACCCGATTCA 
ACGGCGGCTC 
GCGTACCACG 
ACAGCAAGTT 
CCCGCGTCCT 
CGATATTGCC 
TCAAAAAAGG 
GGCAAAACCA 
ATCGGATATG 
CATCCGGCGG 
GACCAATCCG 
GGACGGCAAA 
ACGGTACGGC 
GGGGCAGGCT 
CGCCAAAATC 
GCGGCCTGCT 
GGCGACACGC 
TTCGGCAGCG 
AGGGCGGCAG 
TCATCCGCAA 
TATGCCGGGC 
TACAGCATGC 
GCTACCGTCT 
CCGCCTGAAA 
GCGTCATCGC 
GTTGAAGGCA 
AACCGGCGAA 
CATGGAGCGA 
GCAGGCATAC 
CTCCTACGGA 
AACATGCGGA 
GGCGGTGTCA 
CGGTCTGCGC 
GTGCTTTGGG 
CTCGCGGGTC 



CGCCCGACTT 
ACAACAGCGA 
GTGCAAAGAC 
CAGACAGGGA 
GACTTTCCAA 
TGCAATTGAA 
ACACAGGCGA 
AAAGAACACG 
GAAGGAAGCG 
ACGATGAGGC 
GTAAAAGAAA 
TTCCGTGGAC 
ACATAATGAA 
ATCCGCAATG 
TAACAGTTTT 
TAGCCAATTC 
GGTGATAAAA 
CAACCTGTCC 
CAGGCAATGA 
TATGAAAAAG 
CAGTGGAGAA 
CGCTTGAGTA 
TCGGCACCCT 
AATTGCCGGA 
TGCTGCTGCA 
TTGCTGACGA 
CGGCTGGGGA 
TTCCGTTCGG 
TACTCCTTCC 
CGGCAGCCAA 
TTATCGAAGG 
CGCGTCGAAA 
CAGCCTGAAC 
GCGCAAACGA 
GGTACGCTGT 
GATTATCGGC 
ATCTCAACAG 
GGGCAGGATT 
GGCTTCCCTC 
TGTCCTATTA 
GCACATTCCG 
CAATCTGGAA 
CACCCGAGAC 
ATCCGCCCCT 
GAATGCCGCC 
ATGCCGACAG 
GCCGTATCGG 
GCAAACCCAA 
AAATGCGCGG 
AATACGACAG 
AAACAGTGCA 
GGCACGATGC 
CGCTACAAAA 
AGGCAGCGTC 
ACGTTCCGTT 
TACGACCTGC 
CTGGAGCGGC 
TGAAGCTGTC 



CAATGCAGGC 
AATCAGCAGC 
AGAAGCATGC 
TGCCAAAATC 
ACCCAAATGA 
GCAGGCTATA 
ATCCGTCGGC 
GCTATAACGA 
CCTGAAGACG 
CGTTATAGAG 
TCGGACACAT 
GGCAGACCTG 
TACGAATGAT 
CATGGGTCAA 
GGAACAACAT 
GGAGGAGCAG 
CAGACGAGGG 
TACCACATCC 
CGCACAAGCT 
ACGCTCAAAA 
AAGTTCAAAC 
TGGCTCCAAC 
ATGAAGCAAG 
ACATCCTTTT 
GAAATACCCG 
CGGCTCAGGA 
CTGCTGGATG 
CGACTTTACC 
GTAACGACAT 
CTGCAACTGC 
CGGTTCGCTG 
CCAAAGGTGC 
AGCGACGGCA 
AACCGTACAC 
ACACACGTTT 
GGCAAGCTGT 
TACCGGACGA 
ATTCTTTCTT 
GACAGCGTCG 
TGTCCGTCGC 
CGCCCGCCGG 
AACCTGATGG 
GGTTGAAACT 
ACGGCGCAAC 
GACGGTGTAC 
TACCGCCGCC 
ACGGGTTGGA 
CAGGACGGTG 
CAGTACCCAA 
CAGCCGCCAC 
AATGCAAAAA 
GGGCGATATC 
ACAGCATCAG 
AACGGCACGC 
TGCCGCAACG 
TCAAACAGGA 
AACAGCCTCA 
GCAACCCTTG 



GGTACCGGTA 
AGTATCTTAC 
TCTGTGCCGG 
AATGCCCCCC 
CGCATACAAG 
CAGGACGCGG 
AGCATATCCT 
AAATTACAAA 
GAGGCGGTAA 
ACTGAAGCAA 
CGATTTGGTC 
CAGGCGGTAT 
GAAACCAAGA 
GCTGGGCGAA 
CGAGGGCAGG 
TACCGCCAAG 
TATCCGCCTG 
GTAATAAAAA 
CAGCCCAACA 
AGGCATTATC 
GGGAAATGTA 
CATTGCGGAA 
CGTCCGTTTC 
CCGCACCCAT 
TGGATGAGCA 
CATCGGTGCA 
CGGGTAAGGC 
GCCGATACGA 
TTCAGGCACG 
ACGGCAACAA 
GTGTTGTACG 
GCTGATTTAT 
TTGTCTATCT 
ATCAAAGGCA 
GGGCAAACTG 
ACATGTCGGC 
CGTGTTCCCT 
CACAAACATC 
AAAAAACAGC 
GGCAATGCGG 
TCTGAAACAC 
TCGAACTGGA 
GCGGCAGCCG 
TTTCCGCGCA 
GCATCTTCAA 
CATGCCGATA 
CCACAACGGC 
GAACGTGGGA 
ACCGTCGGCA 
ACTGGGCATG 
CCGACAGCAT 
GGCTATCTCA 
CCGCAGCACC 
TGATGCAGCT 
GGAGATTTGA 
TGCATTCGCC 
CTGAAGGCAC 
AGCGATAAAG 



TCGGCAGCAA 
GCCGGTATCA 
TCGGGATGAC 
CCCCGAATCT 
AATTTGATCA 
GGTAGAGGTA 
TTCCCGAACT 
AACTATACGG 
AGACATTGAA 
AGCCGACGGA 
TCCCATATTA 
TGCGCCCGAT 
ACGAAATGAT 
CGTGGCGTGC 
CACTGCCGAC 
CGTTGCTCGA 
ATGCAACAGA 
CATGCTTTTC 
CATATGCCCT 
ACAGTCGCAG 
TGGAGAACCG 
TTACTGCCAT 
ACCCGTACAA 
CGTAACCGGC 
ACGACAACCT 
GTCGGCGTGG 
CATGAACGGA 
AAGGTACATC 
GGCGGCCTGA 
CACCTATACG 
GCAACAACAA 
AACGGGGCGG 
GGCAGATACC 
GTCTGCAGCT 
CTGAAAGTGG 
ACGCGGCAAG 
TCCTGAGTGC 
GAAACCGACG 
GGGCAGTGAA 
CACGGACTGC 
GCCGTAGAAC 
TGCCTCCGAA 
ACCGCACAGA 
GCGGCAGCCG 
CAGTCTCGCC 
TGCAGGGACG 
ACGGGTCTGC 
ACAGGGCGGT 
TTGCCGCGAA 
GGACGCAGCA 
TAGTCTGTTT 
AAGGCCTGTT 
GGTGCGGACG 
GGGCGCACTG 
CGGTCGAAGG 
GAAAAAGGCA 
GCTGGTCGGA 
CCGTCCTGTT 
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2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA 

2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC 

3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT 

3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC 

3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG 

3151 GGTGGCGGAG GCACTGGATC CGCCACAAAC GACGACGATG TTAAAAAAGC 

3201 TGCCACTGTG GCCATTGCTG CTGCCTACAA CAATGGCCAA GAAATCAACG 

3251 GTTTCAAAGC TGGAGAGACC ATCTACGACA TTGATGAAGA CGGCACAATT 

3301 ACCAAAAAAG ACGCAACTGC AGCCGATGTT GAAGCCGACG ACTTTAAAGG 

3351 TCTGGGTCTG AAAAAAGTCG TGACTAACCT GACCAAAACC GTCAATGAAA 

3401 ACAAACAAAA CGTCGATGCC AAAGTAAAAG CTGCAGAATC TGAAATAGAA 

3451 AAGTTAACAA CCAAGTTAGC AGACACTGAT GCCGCTTTAG CAGATACTGA 

3501 TGCCGCTCTG GATGCAACCA CCAACGCCTT GAATAAATTG GGAGAAAATA 

3551 TAACGACATT TGCTGAAGAG ACTAAGACAA ATATCGTAAA AATTGATGAA 

3601 AAATTAGAAG CCGTGGCTGA TACCGTCGAC AAGCATGCCG AAGCATTCAA 

3651 CGATATCGCC GATTCATTGG ATGAAACCAA CACTAAGGCA GACGAAGCCG 

3701 TCAAAACCGC CAATGAAGCC AAACAGACGG CCGAAGAAAC CAAACAAAAC 

3751 GTCGATGCCA AAGTAAAAGC TGCAGAAACT GCAGCAGGCA AAGCCGAAGC 

3801 TGCCGCTGGC ACAGCTAATA CTGCAGCCGA CAAGGCCGAA GCTGTCGCTG 

3851 CAAAAGTTAC CGACATCAAA GCTGATATCG CTACGAACAA AGATAATATT 

3901 GCTAAAAAAG CAAACAGTGC CGACGTGTAC ACCAGAGAAG AGTCTGACAG 

3951 CAAATTTGTC AGAATTGATG GTCTGAACGC TACTACCGAA AAATTGGACA 

4001 CACGCTTGGC TTCTGCTGAA AAATCCATTG CCGATCACGA TACTCGCCTG 

4051 AACGGTTTGG ATAAAACAGT GTCAGACCTG CGCAAAGAAA CCCGCCAAGG 

4101 CCTTGCAGAA CAAGCCGCGC TCTCCGGTCT GTTCCAACCT TACAACGTGG 

4151 GTCGGTTCAA TGTAACGGCT GCAGTCGGCG GCTACAAATC CGAATCGGCA 

4201 GTCGCCATCG GTACCGGCTT CCGCTTTACC GAAAACTTTG CCGCCAAAGC 

4251 AGGCGTGGCA GTCGGCACTT CGTCCGGTTC TTCCGCAGCC TACCATGTCG 

4301 GCGTCAATTA CGAGTGGCTC GAGCACCACC ACCACCACCA CTGA 

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD 

51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV 

101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE 

151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD 

201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD 

251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF 

301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP 

351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG 

401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG 

451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT 

501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT 

551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK 

601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE 

651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE 

701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA 

751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG 

801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF 

851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL 

901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG 

951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR 

1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE 

1051 GGGGTGSATN DDDVKKAATV AIAAAYNNGQ EINGFKAGET IYDIDEDGTI 

1101 TKKDATAADV EADDFKGLGL KKVVTNLTKT VNENKQNVDA KVKAAESEIE 

1151 KLTTKLADTD AALADTDAAL DATTNALNKL GENITTFAEE TKTNIVKIDE 

1201 KLEAVADTVD KHAEAFNDIA DSLDETNTKA DEAVKTANEA KQTAEETKQN 

1251 VDAKVKAAET AAGKAEAAAG TANTAADKAE AVAAKVTDIK ADIATNKDNI 

1301 AKKANSADVY TREESDSKFV RIDGLNATTE KLDTRLASAE KSIADHDTRL 

1351 NGLDKTVSDL RKETRQGLAE QAALSGLFQP YNVGRFNVTA AVGGYKSESA 

1401 VAIGTGFRFT ENFAAKAGVA VGTSSGSSAA YHVGVNYEWL EHHHHHH* 



ΔG983-961c

1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA 
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA 
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC 
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT 
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG
1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG
1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC
1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG
1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA
1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC
2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC
2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA
2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA
2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG
2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC
2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG
2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC
2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT
2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA
2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA
2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT
2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT
2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG
2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG
2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG
2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA
2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA
2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT
2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA
2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC
3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT
3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC
3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG
3151 GGTGGCGGAG GCACTGGATC CGCCACAAAC GACGACGATG TTAAAAAAGC
3201 TGCCACTGTG GCCATTGCTG CTGCCTACAA CAATGGCCAA GAAATCAACG
3251 GTTTCAAAGC TGGAGAGACC ATCTACGACA TTGATGAAGA CGGCACAATT
3301 ACCAAAAAAG ACGCAACTGC AGCCGATGTT GAAGCCGACG ACTTTAAAGG
3351 TCTGGGTCTG AAAAAAGTCG TGACTAACCT GACCAAAACC GTCAATGAAA
3401 ACAAACAAAA CGTCGATGCC AAAGTAAAAG CTGCAGAATC TGAAATAGAA
3451 AAGTTAACAA CCAAGTTAGC AGACACTGAT GCCGCTTTAG CAGATACTGA
3501 TGCCGCTCTG GATGCAACCA CCAACGCCTT GAATAAATTG GGAGAAAATA
3551 TAACGACATT TGCTGAAGAG ACTAAGACAA ATATCGTAAA AATTGATGAA
3601 AAATTAGAAG CCGTGGCTGA TACCGTCGAC AAGCATGCCG AAGCATTCAA

3651 CGATATCGCC GATTCATTGG ATGAAACCAA CACTAAGGCA GACGAAGCCG 

3701 TCAAAACCGC CAATGAAGCC AAACAGACGG CCGAAGAAAC CAAACAAAAC 

3751 GTCGATGCCA AAGTAAAAGC TGCAGAAACT GCAGCAGGCA AAGCCGAAGC 

3801 TGCCGCTGGC ACAGCTAATA CTGCAGCCGA CAAGGCCGAA GCTGTCGCTG 

3851 CAAAAGTTAC CGACATCAAA GCTGATATCG CTACGAACAA AGATAATATT 

3901 GCTAAAAAAG CAAACAGTGC CGACGTGTAC ACCAGAGAAG AGTCTGACAG 

3951 CAAATTTGTC AGAATTGATG GTCTGAACGC TACTACCGAA AAATTGGACA 

4001 CACGCTTGGC TTCTGCTGAA AAATCCATTG CCGATCACGA TACTCGCCTG 

4051 AACGGTTTGG ATAAAACAGT GTCAGACCTG CGCAAAGAAA CCCGCCAAGG 

4101 CCTTGCAGAA CAAGCCGCGC TCTCCGGTCT GTTCCAACCT TACAACGTGG 

4151 GTCTCGAGCA CCACCACCAC CACCACTGA 

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD 

51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV 

101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE 

151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD 

201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD 

251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF 

301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP 

351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG 

401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG 

451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT 

501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT 

551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK 

601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE 

651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE 

701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA 

751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG 

801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF 

851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL 

901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG 

951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR 

1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE 

1051 GGGGTGSATN DDDVKKAATV AIAAAYNNGQ EINGFKAGET IYDIDEDGTI 

1101 TKKDATAADV EADDFKGLGL KKVVTNLTKT VNENKQNVDA KVKAAESEIE 

1151 KLTTKLADTD AALADTDAAL DATTNALNKL GENITTFAEE TKTNIVKIDE 

1201 KLEAVADTVD KHAEAFNDIA DSLDETNTKA DEAVKTANEA KQTAEETKQN 

1251 VDAKVKAAET AAGKAEAAAG TANTAADKAE AVAAKVTDIK ADIATNKDNI 

1301 AKKANSADVY TREESDSKFV RIDGLNATTE KLDTRLASAE KSIADHDTRL 

1351 NGLDKTVSDL RKETRQGLAE QAALSGLFQP YNVGLEHHHH HH* 

ΔG741 and hybrids

Bactericidal titres generated in response to ΔG741 (His-fusion) were measured against
various strains, including the homologous 2996 strain:

          2996     MC58      NGH38     F6124     BZ133
ΔG741     512      131072    >2048     16384     >2048

As can be seen, the ΔG741-induced bactericidal titre is particularly high against
heterologous strain MC58.

ΔG741 was also fused directly in-frame upstream of proteins 961, 961c, 983 and ORF46.1:

ΔG741-961

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA
301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT
351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT
401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG
451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC
501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA
551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC
601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC
651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG
701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC
751 GAGGGTGGCG GAGGCACTGG ATCCGCCACA AACGACGACG ATGTTAAAAA
801 AGCTGCCACT GTGGCCATTG CTGCTGCCTA CAACAATGGC CAAGAAATCA
851 ACGGTTTCAA AGCTGGAGAG ACCATCTACG ACATTGATGA AGACGGCACA
901 ATTACCAAAA AAGACGCAAC TGCAGCCGAT GTTGAAGCCG ACGACTTTAA
951 AGGTCTGGGT CTGAAAAAAG TCGTGACTAA CCTGACCAAA ACCGTCAATG
1001 AAAACAAACA AAACGTCGAT GCCAAAGTAA AAGCTGCAGA ATCTGAAATA
1051 GAAAAGTTAA CAACCAAGTT AGCAGACACT GATGCCGCTT TAGCAGATAC
1101 TGATGCCGCT CTGGATGCAA CCACCAACGC CTTGAATAAA TTGGGAGAAA
1151 ATATAACGAC ATTTGCTGAA GAGACTAAGA CAAATATCGT AAAAATTGAT
1201 GAAAAATTAG AAGCCGTGGC TGATACCGTC GACAAGCATG CCGAAGCATT
1251 CAACGATATC GCCGATTCAT TGGATGAAAC CAACACTAAG GCAGACGAAG
1301 CCGTCAAAAC CGCCAATGAA GCCAAACAGA CGGCCGAAGA AACCAAACAA
1351 AACGTCGATG CCAAAGTAAA AGCTGCAGAA ACTGCAGCAG GCAAAGCCGA
1401 AGCTGCCGCT GGCACAGCTA ATACTGCAGC CGACAAGGCC GAAGCTGTCG
1451 CTGCAAAAGT TACCGACATC AAAGCTGATA TCGCTACGAA CAAAGATAAT
1501 ATTGCTAAAA AAGCAAACAG TGCCGACGTG TACACCAGAG AAGAGTCTGA
1551 CAGCAAATTT GTCAGAATTG ATGGTCTGAA CGCTACTACC GAAAAATTGG
1601 ACACACGCTT GGCTTCTGCT GAAAAATCCA TTGCCGATCA CGATACTCGC
1651 CTGAACGGTT TGGATAAAAC AGTGTCAGAC CTGCGCAAAG AAACCCGCCA
1701 AGGCCTTGCA GAACAAGCCG CGCTCTCCGG TCTGTTCCAA CCTTACAACG
1751 TGGGTCGGTT CAATGTAACG GCTGCAGTCG GCGGCTACAA ATCCGAATCG
1801 GCAGTCGCCA TCGGTACCGG CTTCCGCTTT ACCGAAAACT TTGCCGCCAA
1851 AGCAGGCGTG GCAGTCGGCA CTTCGTCCGG TTCTTCCGCA GCCTACCATG
1901 TCGGCGTCAA TTACGAGTGG CTCGAGCACC ACCACCACCA CCACTGA

1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT
51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL
101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA
151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA
201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL
251 EGGGGTGSAT NDDDVKKAAT VAIAAAYNNG QEINGFKAGE TIYDIDEDGT
301 ITKKDATAAD VEADDFKGLG LKKVVTNLTK TVNENKQNVD AKVKAAESEI
351 EKLTTKLADT DAALADTDAA LDATTNALNK LGENITTFAE ETKTNIVKID
401 EKLEAVADTV DKHAEAFNDI ADSLDETNTK ADEAVKTANE AKQTAEETKQ
451 NVDAKVKAAE TAAGKAEAAA GTANTAADKA EAVAAKVTDI KADIATNKDN
501 IAKKANSADV YTREESDSKF VRIDGLNATT EKLDTRLASA EKSIADHDTR
551 LNGLDKTVSD LRKETRQGLA EQAALSGLFQ PYNVGRFNVT AAVGGYKSES
601 AVAIGTGFRF TENFAAKAGV AVGTSSGSSA AYHVGVNYEW LEHHHHHH*

ΔG741-961c

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA
301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT
351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT
401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG
451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC
501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA
551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC
601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC
651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG
701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC
751 GAGGGTGGCG GAGGCACTGG ATCCGCCACA AACGACGACG ATGTTAAAAA
801 AGCTGCCACT GTGGCCATTG CTGCTGCCTA CAACAATGGC CAAGAAATCA
851 ACGGTTTCAA AGCTGGAGAG ACCATCTACG ACATTGATGA AGACGGCACA
901 ATTACCAAAA AAGACGCAAC TGCAGCCGAT GTTGAAGCCG ACGACTTTAA
951 AGGTCTGGGT CTGAAAAAAG TCGTGACTAA CCTGACCAAA ACCGTCAATG
1001 AAAACAAACA AAACGTCGAT GCCAAAGTAA AAGCTGCAGA ATCTGAAATA
1051 GAAAAGTTAA CAACCAAGTT AGCAGACACT GATGCCGCTT TAGCAGATAC
1101 TGATGCCGCT CTGGATGCAA CCACCAACGC CTTGAATAAA TTGGGAGAAA
1151 ATATAACGAC ATTTGCTGAA GAGACTAAGA CAAATATCGT AAAAATTGAT
1201 GAAAAATTAG AAGCCGTGGC TGATACCGTC GACAAGCATG CCGAAGCATT
1251 CAACGATATC GCCGATTCAT TGGATGAAAC CAACACTAAG GCAGACGAAG
1301 CCGTCAAAAC CGCCAATGAA GCCAAACAGA CGGCCGAAGA AACCAAACAA
1351 AACGTCGATG CCAAAGTAAA AGCTGCAGAA ACTGCAGCAG GCAAAGCCGA
1401 AGCTGCCGCT GGCACAGCTA ATACTGCAGC CGACAAGGCC GAAGCTGTCG
1451 CTGCAAAAGT TACCGACATC AAAGCTGATA TCGCTACGAA CAAAGATAAT
1501 ATTGCTAAAA AAGCAAACAG TGCCGACGTG TACACCAGAG AAGAGTCTGA
1551 CAGCAAATTT GTCAGAATTG ATGGTCTGAA CGCTACTACC GAAAAATTGG
1601 ACACACGCTT GGCTTCTGCT GAAAAATCCA TTGCCGATCA CGATACTCGC
1651 CTGAACGGTT TGGATAAAAC AGTGTCAGAC CTGCGCAAAG AAACCCGCCA
1701 AGGCCTTGCA GAACAAGCCG CGCTCTCCGG TCTGTTCCAA CCTTACAACG
1751 TGGGTCTCGA GCACCACCAC CACCACCACT GA

1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT
51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL
101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA
151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA
201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL
251 EGGGGTGSAT NDDDVKKAAT VAIAAAYNNG QEINGFKAGE TIYDIDEDGT
301 ITKKDATAAD VEADDFKGLG LKKVVTNLTK TVNENKQNVD AKVKAAESEI
351 EKLTTKLADT DAALADTDAA LDATTNALNK LGENITTFAE ETKTNIVKID
401 EKLEAVADTV DKHAEAFNDI ADSLDETNTK ADEAVKTANE AKQTAEETKQ
451 NVDAKVKAAE TAAGKAEAAA GTANTAADKA EAVAAKVTDI KADIATNKDN
501 IAKKANSADV YTREESDSKF VRIDGLNATT EKLDTRLASA EKSIADHDTR
551 LNGLDKTVSD LRKETRQGLA EQAALSGLFQ PYNVGLEHHH HHH*

ΔG741-983

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA
301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT
351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT
401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG
451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC
501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA
551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC
601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC
651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG
701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC
751 GAGGGATCCG GCGGAGGCGG CACTTCTGCG CCCGACTTCA ATGCAGGCGG
801 TACCGGTATC GGCAGCAACA GCAGAGCAAC AACAGCGAAA TCAGCAGCAG
851 TATCTTACGC CGGTATCAAG AACGAAATGT GCAAAGACAG AAGCATGCTC
901 TGTGCCGGTC GGGATGACGT TGCGGTTACA GACAGGGATG CCAAAATCAA
951 TGCCCCCCCC CCGAATCTGC ATACCGGAGA CTTTCCAAAC CCAAATGACG
1001 CATACAAGAA TTTGATCAAC CTCAAACCTG CAATTGAAGC AGGCTATACA
1051 GGACGCGGGG TAGAGGTAGG TATCGTCGAC ACAGGCGAAT CCGTCGGCAG
1101 CATATCCTTT CCCGAACTGT ATGGCAGAAA AGAACACGGC TATAACGAAA
1151 ATTACAAAAA CTATACGGCG TATATGCGGA AGGAAGCGCC TGAAGACGGA
1201 GGCGGTAAAG ACATTGAAGC TTCTTTCGAC GATGAGGCCG TTATAGAGAC
1251 TGAAGCAAAG CCGACGGATA TCCGCCACGT AAAAGAAATC GGACACATCG
1301 ATTTGGTCTC CCATATTATT GGCGGGCGTT CCGTGGACGG CAGACCTGCA
1351 GGCGGTATTG CGCCCGATGC GACGCTACAC ATAATGAATA CGAATGATGA
1401 AACCAAGAAC GAAATGATGG TTGCAGCCAT CCGCAATGCA TGGGTCAAGC
1451 TGGGCGAACG TGGCGTGCGC ATCGTCAATA ACAGTTTTGG AACAACATCG
1501 AGGGCAGGCA CTGCCGACCT TTTCCAAATA GCCAATTCGG AGGAGCAGTA



1551 CCGCCAAGCG TTGCTCGACT ATTCCGGCGG TGATAAAACA GACGAGGGTA 

1601 TCCGCCTGAT GCAACAGAGC GATTACGGCA ACCTGTCCTA CCACATCCGT 

1651 AATAAAAACA TGCTTTTCAT CTTTTCGACA GGCAATGACG CACAAGCTCA 

1701 GCCCAACACA TATGCCCTAT TGCCATTTTA TGAAAAAGAC GCTCAAAAAG 

1751 GCATTATCAC AGTCGCAGGC GTAGACCGCA GTGGAGAAAA GTTCAAACGG 

1801 GAAATGTATG GAGAACCGGG TACAGAACCG CTTGAGTATG GCTCCAACCA 

1851 TTGCGGAATT ACTGCCATGT GGTGCCTGTC GGCACCCTAT GAAGCAAGCG 

1901 TCCGTTTCAC CCGTACAAAC CCGATTCAAA TTGCCGGAAC ATCCTTTTCC 

1951 GCACCCATCG TAACCGGCAC GGCGGCTCTG CTGCTGCAGA AATACCCGTG 

2001 GATGAGCAAC GACAACCTGC GTACCACGTT GCTGACGACG GCTCAGGACA 

2051 TCGGTGCAGT CGGCGTGGAC AGCAAGTTCG GCTGGGGACT GCTGGATGCG 

2101 GGTAAGGCCA TGAACGGACC CGCGTCCTTT CCGTTCGGCG ACTTTACCGC 

2151 CGATACGAAA GGTACATCCG ATATTGCCTA CTCCTTCCGT AACGACATTT 

2201 CAGGCACGGG CGGCCTGATC AAAAAAGGCG GCAGCCAACT GCAACTGCAC 

2251 GGCAACAACA CCTATACGGG CAAAACCATT ATCGAAGGCG GTTCGCTGGT 

2301 GTTGTACGGC AACAACAAAT CGGATATGCG CGTCGAAACC AAAGGTGCGC 

2351 TGATTTATAA CGGGGCGGCA TCCGGCGGCA GCCTGAACAG CGACGGCATT 

2401 GTCTATCTGG CAGATACCGA CCAATCCGGC GCAAACGAAA CCGTACACAT 

2451 CAAAGGCAGT CTGCAGCTGG ACGGCAAAGG TACGCTGTAC ACACGTTTGG 

2501 GCAAACTGCT GAAAGTGGAC GGTACGGCGA TTATCGGCGG CAAGCTGTAC 

2551 ATGTCGGCAC GCGGCAAGGG GGCAGGCTAT CTCAACAGTA CCGGACGACG 

2601 TGTTCCCTTC CTGAGTGCCG CCAAAATCGG GCAGGATTAT TCTTTCTTCA 

2651 CAAACATCGA AACCGACGGC GGCCTGCTGG CTTCCCTCGA CAGCGTCGAA 

2701 AAAACAGCGG GCAGTGAAGG CGACACGCTG TCCTATTATG TCCGTCGCGG 

2751 CAATGCGGCA CGGACTGCTT CGGCAGCGGC ACATTCCGCG CCCGCCGGTC 

2801 TGAAACACGC CGTAGAACAG GGCGGCAGCA ATCTGGAAAA CCTGATGGTC 

2851 GAACTGGATG CCTCCGAATC ATCCGCAACA CCCGAGACGG TTGAAACTGC 

2901 GGCAGCCGAC CGCACAGATA TGCCGGGCAT CCGCCCCTAC GGCGCAACTT 

2951 TCCGCGCAGC GGCAGCCGTA CAGCATGCGA ATGCCGCCGA CGGTGTACGC 

3001 ATCTTCAACA GTCTCGCCGC TACCGTCTAT GCCGACAGTA CCGCCGCCCA 

3051 TGCCGATATG CAGGGACGCC GCCTGAAAGC CGTATCGGAC GGGTTGGACC 

3101 ACAACGGCAC GGGTCTGCGC GTCATCGCGC AAACCCAACA GGACGGTGGA 

3151 ACGTGGGAAC AGGGCGGTGT TGAAGGCAAA ATGCGCGGCA GTACCCAAAC 

3201 CGTCGGCATT GCCGCGAAAA CCGGCGAAAA TACGACAGCA GCCGCCACAC 

3251 TGGGCATGGG ACGCAGCACA TGGAGCGAAA ACAGTGCAAA TGCAAAAACC 

3301 GACAGCATTA GTCTGTTTGC AGGCATACGG CACGATGCGG GCGATATCGG 

3351 CTATCTCAAA GGCCTGTTCT CCTACGGACG CTACAAAAAC AGCATCAGCC 

3401 GCAGCACCGG TGCGGACGAA CATGCGGAAG GCAGCGTCAA CGGCACGCTG 

3451 ATGCAGCTGG GCGCACTGGG CGGTGTCAAC GTTCCGTTTG CCGCAACGGG 

3501 AGATTTGACG GTCGAAGGCG GTCTGCGCTA CGACCTGCTC AAACAGGATG 

3551 CATTCGCCGA AAAAGGCAGT GCTTTGGGCT GGAGCGGCAA CAGCCTCACT 

3601 GAAGGCACGC TGGTCGGACT CGCGGGTCTG AAGCTGTCGC AACCCTTGAG 

3651 CGATAAAGCC GTCCTGTTTG CAACGGCGGG CGTGGAACGC GACCTGAACG 

3701 GACGCGACTA CACGGTAACG GGCGGCTTTA CCGGCGCGAC TGCAGCAACC 

3751 GGCAAGACGG GGGCACGCAA TATGCCGCAC ACCCGTCTGG TTGCCGGCCT 

3801 GGGCGCGGAT GTCGAATTCG GCAACGGCTG GAACGGCTTG GCACGTTACA 

3851 GCTACGCCGG TTCCAAACAG TACGGCAACC ACAGCGGACG AGTCGGCGTA 

3901 GGCTACCGGT TCCTCGAGCA CCACCACCAC CACCACTGA 

1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT 

51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL 

101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA 

151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA 

201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL 

251 EGSGGGGTSA PDFNAGGTGI GSNSRATTAK SAAVSYAGIK NEMCKDRSML 

301 CAGRDDVAVT DRDAKINAPP PNLHTGDFPN PNDAYKNLIN LKPAIEAGYT 

351 GRGVEVGIVD TGESVGSISF PELYGRKEHG YNENYKNYTA YMRKEAPEDG 

401 GGKDIEASFD DEAVIETEAK PTDIRHVKEI GHIDLVSHII GGRSVDGRPA 

451 GGIAPDATLH IMNTNDETKN EMMVAAIRNA WVKLGERGVR IVNNSFGTTS 

501 RAGTADLFQI ANSEEQYRQA LLDYSGGDKT DEGIRLMQQS DYGNLSYHIR 

551 NKNMLFIFST GNDAQAQPNT YALLPFYEKD AQKGIITVAG VDRSGEKFKR 

601 EMYGEPGTEP LEYGSNHCGI TAMWCLSAPY EASVRFTRTN PIQIAGTSFS 

651 APIVTGTAAL LLQKYPWMSN DNLRTTLLTT AQDIGAVGVD SKFGWGLLDA 

701 GKAMNGPASF PFGDFTADTK GTSDIAYSFR NDISGTGGLI KKGGSQLQLH 

751 GNNTYTGKTI IEGGSLVLYG NNKSDMRVET KGALIYNGAA SGGSLNSDGI 

801 VYLADTDQSG ANETVHIKGS LQLDGKGTLY TRLGKLLKVD GTAIIGGKLY 

851 MSARGKGAGY LNSTGRRVPF LSAAKIGQDY SFFTNIETDG GLLASLDSVE 

901 KTAGSEGDTL SYYVRRGNAA RTASAAAHSA PAGLKHAVEQ GGSNLENLMV 

951 ELDASESSAT PETVETAAAD RTDMPGIRPY GATFRAAAAV QHANAADGVR 

1001 IFNSLAATVY ADSTAAHADM QGRRLKAVSD GLDHNGTGLR VIAQTQQDGG 

1051 TWEQGGVEGK MRGSTQTVGI AAKTGENTTA AATLGMGRST WSENSANAKT 

1101 DSISLFAGIR HDAGDIGYLK GLFSYGRYKN SISRSTGADE HAEGSVNGTL 

1151 MQLGALGGVN VPFAATGDLT VEGGLRYDLL KQDAFAEKGS ALGWSGNSLT 

1201 EGTLVGLAGL KLSQPLSDKA VLFATAGVER DLNGRDYTVT GGFTGATAAT 

1251 GKTGARNMPH TRLVAGLGAD VEFGNGWNGL ARYSYAGSKQ YGNHSGRVGV 

1301 GYRFLEHHHH HH* 
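
A listing of this kind can be sanity-checked by translating the ends of the coding sequence and comparing them with the protein sequence. The following is a minimal illustrative sketch, not part of the original disclosure: the function name `translate` is an assumption, the standard genetic code is assumed, and the two fragments are the first 30 and last 30 nucleotides of the 741-983 hybrid listed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Fragments copied from the listing (spaces are the listing's 10-mer grouping).
first_30 = "ATGGTCGCCG CCGACATCGG TGCGGGGCTT"
last_30 = "T TCCTCGAGCA CCACCACCAC CACCACTGA"

print(translate(first_30))  # N-terminus of the hybrid
print(translate(last_30))   # C-terminal residues, His-tag and stop codon
```

The first fragment should reproduce the opening residues of the protein listing and the second its Leu-Glu-(His)6 tail followed by the stop codon.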

ΔG741-ORF46.1

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC 

51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG 

101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT 

151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT 

201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA 

251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA 

301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT 

351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT 

401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG 

451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC 

501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA 

551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC 

601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC 

651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG 

701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC 

751 GACGGTGGCG GAGGCACTGG ATCCTCAGAT TTGGCAAACG ATTCTTTTAT 

801 CCGGCAGGTT CTCGACCGTC AGCATTTCGA ACCCGACGGG AAATACCACC 

851 TATTCGGCAG CAGGGGGGAA CTTGCCGAGC GCAGCGGCCA TATCGGATTG 

901 GGAAAAATAC AAAGCCATCA GTTGGGCAAC CTGATGATTC AACAGGCGGC 

951 CATTAAAGGA AATATCGGCT ACATTGTCCG CTTTTCCGAT CACGGGCACG 

1001 AAGTCCATTC CCCCTTCGAC AACCATGCCT CACATTCCGA TTCTGATGAA 

1051 GCCGGTAGTC CCGTTGACGG ATTTAGCCTT TACCGCATCC ATTGGGACGG 

1101 ATACGAACAC CATCCCGCCG ACGGCTATGA CGGGCCACAG GGCGGCGGCT 

1151 ATCCCGCTCC CAAAGGCGCG AGGGATATAT ACAGCTACGA CATAAAAGGC 

1201 GTTGCCCAAA ATATCCGCCT CAACCTGACC GACAACCGCA GCACCGGACA 

1251 ACGGCTTGCC GACCGTTTCC ACAATGCCGG TAGTATGCTG ACGCAAGGAG 

1301 TAGGCGACGG ATTCAAACGC GCCACCCGAT ACAGCCCCGA GCTGGACAGA 

1351 TCGGGCAATG CCGCCGAAGC CTTCAACGGC ACTGCAGATA TCGTTAAAAA 

1401 CATCATCGGC GCGGCAGGAG AAATTGTCGG CGCAGGCGAT GCCGTGCAGG 

1451 GCATAAGCGA AGGCTCAAAC ATTGCTGTCA TGCACGGCTT GGGTCTGCTT 

1501 TCCACCGAAA ACAAGATGGC GCGCATCAAC GATTTGGCAG ATATGGCGCA 

1551 ACTCAAAGAC TATGCCGCAG CAGCCATCCG CGATTGGGCA GTCCAAAACC 

1601 CCAATGCCGC ACAAGGCATA GAAGCCGTCA GCAATATCTT TATGGCAGCC 

1651 ATCCCCATCA AAGGGATTGG AGCTGTTCGG GGAAAATACG GCTTGGGCGG 

1701 CATCACGGCA CATCCTATCA AGCGGTCGCA GATGGGCGCG ATCGCATTGC 

1751 CGAAAGGGAA ATCCGCCGTC AGCGACAATT TTGCCGATGC GGCATACGCC 

1801 AAATACCCGT CCCCTTACCA TTCCCGAAAT ATCCGTTCAA ACTTGGAGCA 

1851 GCGTTACGGC AAAGAAAACA TCACCTCCTC AACCGTGCCG CCGTCAAACG 

1901 GCAAAAATGT CAAACTGGCA GACCAACGCC ACCCGAAGAC AGGCGTACCG 

1951 TTTGACGGTA AAGGGTTTCC GAATTTTGAG AAGCACGTGA AATATGATAC 

2001 GCTCGAGCAC CACCACCACC ACCACTGA 

1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT 

51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL 

101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA 

151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA 

201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL 

251 DGGGGTGSSD LANDSFIRQV LDRQHFEPDG KYHLFGSRGE LAERSGHIGL 

301 GKIQSHQLGN LMIQQAAIKG NIGYIVRFSD HGHEVHSPFD NHASHSDSDE 

351 AGSPVDGFSL YRIHWDGYEH HPADGYDGPQ GGGYPAPKGA RDIYSYDIKG 

401 VAQNIRLNLT DNRSTGQRLA DRFHNAGSML TQGVGDGFKR ATRYSPELDR 

451 SGNAAEAFNG TADIVKNIIG AAGEIVGAGD AVQGISEGSN IAVMHGLGLL 

501 STENKMARIN DLADMAQLKD YAAAAIRDWA VQNPNAAQGI EAVSNIFMAA 

551 IPIKGIGAVR GKYGLGGITA HPIKRSQMGA IALPKGKSAV SDNFADAAYA 

601 KYPSPYHSRN IRSNLEQRYG KENITSSTVP PSNGKNVKLA DQRHPKTGVP 

651 FDGKGFPNFE KHVKYDTLEH HHHHH*

Example 16 - C-terminal fusions ('hybrids') with 287/ΔG287

According to the invention, hybrids of two proteins A & B may be either NH2-A-B-COOH
or NH2-B-A-COOH. The effect of this difference was investigated using protein 287 either
C-terminal (in '287-His' form) or N-terminal (in ΔG287 form - sequences shown above) to
919, 953 and ORF46.1. A panel of strains was used, including homologous strain 2996. FCA
was used as adjuvant:





                 287 & 919               287 & 953               287 & ORF46.1
Strain           ΔG287-919   919-287     ΔG287-953   953-287     ΔG287-46.1   46.1-287
2996             128000      16000       65536       8192        16384        8192
BZ232            256         128         128         <4          <4           <4
1000             2048        <4          <4          <4          <4           <4
MC58             8192        1024        16384       1024        512          128
NGH38            32000       2048        >2048       4096        16384        4096
394/98           4096        32          256         128         128          16
MenA (F6124)     32000       2048        >2048       32          8192         1024
MenC (BZ133)     64000       >8192       >8192       <16         8192         2048



Better bactericidal titres are generally seen with 287 at the N-terminus (in the ΔG form).
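
The size of this orientation effect can be expressed as a fold-difference between the two hybrids of each pair. The sketch below is illustrative only: the variable and function names are assumptions, and it uses only the rows for strains 2996 and MC58, where every titre in the table is an exact number rather than a bound such as '<4'.

```python
# Titres copied from the table above, as
# (287 at the N-terminus, 287 at the C-terminus) for each fusion partner.
titres = {
    "2996": {"919": (128000, 16000), "953": (65536, 8192), "ORF46.1": (16384, 8192)},
    "MC58": {"919": (8192, 1024), "953": (16384, 1024), "ORF46.1": (512, 128)},
}


def fold_advantage(n_terminal: int, c_terminal: int) -> float:
    """Titre with 287 N-terminal divided by titre with 287 C-terminal."""
    return n_terminal / c_terminal


ratios = {
    strain: {partner: fold_advantage(*pair) for partner, pair in partners.items()}
    for strain, partners in titres.items()
}

for strain, by_partner in ratios.items():
    print(strain, by_partner)
```

A ratio above 1 means the hybrid with 287 at the N-terminus gave the higher titre for that strain and partner.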



When fused to protein 961 [NH2-ΔG287-961-COOH - sequence shown above], the resulting
protein is insoluble and must be denatured and renatured for purification. Following
renaturation, around 50% of the protein was found to remain insoluble. The soluble and
insoluble proteins were compared, and much better bactericidal titres were obtained with the
soluble protein (FCA as adjuvant):





             2996     BZ232    MC58     NGH38    F6124    BZ133
Soluble      65536    128      4096     >2048    >2048    4096
Insoluble    8192     <4       <4       16       n.d.     n.d.



Titres with the insoluble form were, however, improved by using alum adjuvant instead: 



Insoluble    32768    128      4096     >2048    >2048    2048



Example 17 - N-terminal fusions ('hybrids') to 287 

Expression of protein 287 as full-length with a C-terminal His-tag, or without its leader
peptide but with a C-terminal His-tag, gives fairly low expression levels. Better expression is
achieved using an N-terminal GST-fusion.

As an alternative to using GST as an N-terminal fusion partner, 287 was placed at the
C-terminus of protein 919 ('919-287'), of protein 953 ('953-287'), and of protein ORF46.1
('ORF46.1-287'). In each case, the leader peptides were deleted, and the hybrids were direct
in-frame fusions.

To generate the 953-287 hybrid, the leader peptides of the two proteins were omitted by
designing the forward primer downstream from the leader of each sequence; the stop codon
sequence was omitted in the 953 reverse primer but included in the 287 reverse primer. For
the 953 gene, the 5' and the 3' primers used for amplification included NdeI and BamHI
restriction sites respectively, whereas for the amplification of the 287 gene the 5' and the 3'
primers included BamHI and XhoI restriction sites respectively. In this way a sequential
directional cloning of the two genes in pET21b+, using NdeI-BamHI (to clone the first gene)
and subsequently BamHI-XhoI (to clone the second gene), could be achieved.
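
The primer design described above has two consequences that can be checked mechanically: the upstream gene must carry no stop codon, and the junction must not shift the reading frame. The sketch below illustrates such a check under stated assumptions. The function names are hypothetical, the two short gene fragments are invented placeholders (not the 953 or 287 sequences), and the BamHI site GGATCC is a standard property of that enzyme; being six nucleotides long it contributes two codons and keeps the frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BAMHI_SITE = "GGATCC"  # six nucleotides, i.e. two codons at the junction


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream: str, junction: str, downstream: str) -> str:
    """Join two coding regions and verify a single uninterrupted reading frame."""
    parts = [p.upper() for p in (upstream, junction, downstream)]
    if any(len(p) % 3 for p in parts):
        raise ValueError("a segment length is not a multiple of three")
    fused = "".join(parts)
    triplets = codons(fused)
    if any(c in STOP_CODONS for c in triplets[:-1]):
        raise ValueError("internal stop codon: fusion would be truncated")
    if triplets[-1] not in STOP_CODONS:
        raise ValueError("no stop codon at the end of the downstream gene")
    return fused


# Hypothetical placeholder fragments, for illustration only.
gene_a_no_stop = "ATGGCTAAA"       # upstream gene, stop codon omitted
gene_a_with_stop = "ATGGCTAAATAA"  # same fragment with its stop codon retained
gene_b_with_stop = "GCTGAATGA"     # downstream gene, stop codon included

print(fuse_in_frame(gene_a_no_stop, BAMHI_SITE, gene_b_with_stop))
```

Passing `gene_a_with_stop` as the upstream segment raises an error, which corresponds to the reason the stop codon is left out of the reverse primer of the first gene.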

The 919-287 hybrid was obtained by cloning the sequence coding for the mature portion of
287 into the XhoI site at the 3'-end of the 919-His clone in pET21b+. The primers used for
amplification of the 287 gene were designed to introduce a SalI restriction site at the 5'-end
and an XhoI site at the 3'-end of the PCR fragment. Since the cohesive ends produced by the
SalI and XhoI restriction enzymes are compatible, the 287 PCR product digested with
SalI-XhoI could be inserted in the pET21b-919 clone cleaved with XhoI.

The ORF46.1-287 hybrid was obtained similarly.
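
The compatibility of SalI and XhoI ends can be illustrated by deriving the single-stranded overhang each enzyme leaves. This is a sketch under stated assumptions: the recognition sites and cut positions are the standard published values for these enzymes (they are not given in the text above), and the function names are hypothetical.

```python
# Recognition site and top-strand cut offset for each enzyme (standard values):
# SalI G^TCGAC, XhoI C^TCGAG, BamHI G^GATCC, NdeI CA^TATG.
ENZYMES = {
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
    "BamHI": ("GGATCC", 1),
    "NdeI": ("CATATG", 2),
}


def overhang(enzyme: str) -> str:
    """5' overhang left by a palindromic site cut symmetrically on both strands."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two ends can anneal if their overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def ligated_junction(left: str, right: str) -> str:
    """Sequence formed when the left half of one cut site joins the right half of another."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + right_site[right_cut:]


print(overhang("SalI"), overhang("XhoI"))   # both ends carry the same overhang
print(ligated_junction("XhoI", "SalI"))     # vector XhoI end joined to insert SalI end
print(ligated_junction("XhoI", "XhoI"))     # insert XhoI end joined to vector XhoI end
```

Under these assumptions the XhoI/SalI junction is a hybrid sequence that matches neither recognition site, whereas the XhoI/XhoI junction restores an XhoI site.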

The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins
was compared with antibodies raised against simple mixtures of the component antigens:





            Mixture with 287    Hybrid with 287
919         32000               16000
953         8192                8192
ORF46.1     128                 8192
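
Because titres of this kind span several orders of magnitude, the hybrid and mixture values are easier to compare on a log2 scale. The sketch below is illustrative: the names are assumptions, and reading one log2 unit as one doubling of the titre is a convenience for comparison, not something stated in the text.

```python
import math

# Homologous-strain titres copied from the table above: (mixture, hybrid).
homologous = {
    "919": (32000, 16000),
    "953": (8192, 8192),
    "ORF46.1": (128, 8192),
}


def log2_change(mixture: int, hybrid: int) -> float:
    """Number of doublings (positive) or halvings (negative) going from mixture to hybrid."""
    return math.log2(hybrid / mixture)


changes = {antigen: log2_change(*pair) for antigen, pair in homologous.items()}

for antigen, change in changes.items():
    print(f"{antigen}: {change:+.1f} log2 units")
```

On this scale the ORF46.1 hybrid stands out, while the 919 and 953 hybrids differ from their mixtures by at most one doubling.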



Data for bactericidal activity against heterologous MenB strains and against serotypes A and
C were also obtained for 919-287 and 953-287:

                 919                         953                         ORF46.1
Strain           Mixture       Hybrid        Mixture       Hybrid        Mixture    Hybrid
MC58             512           [illegible]   [illegible]   1024          -          1024
NGH38            1024          2048          2048          4096          -          [illegible]
BZ232            512           128           1024          16            -          -
MenA (F6124)     512           2048          2048          32            -          1024
MenC (C11)       >2048         n.d.          >2048         n.d.          -          n.d.
MenC (BZ133)     >4096         >8192         >4096         <16           -          2048



Hybrids of ORF46.1 and 919 were also constructed. Best results (four-fold higher titre) were 
achieved with 919 at the N-terminus. 

Hybrids 919-519His, ORF97-225His and 225-ORF97His were also tested. These gave
moderate ELISA titres and bactericidal antibody responses.

Example 18 - the leader peptide from ORF4 

As shown above, the leader peptide of ORF4 can be fused to the mature sequence of other
proteins (e.g. proteins 287 and 919). It is able to direct lipidation in E.coli.

Example 19 - domains in 564 

The protein '564' is very large (2073aa), and it is difficult to clone and express it in complete
form. To facilitate expression, the protein has been divided into four domains, as shown in 
figure 8 (according to the MC58 sequence): 



Domain         A         B          C           D
Amino Acids    79-360    361-731    732-2044    2045-2073
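
The domain boundaries above are 1-based and inclusive, so extracting a domain or a pair of adjacent domains (such as 'ab' or 'cd') from the full-length sequence is a matter of slicing. The sketch below is illustrative: the function names are assumptions, and the 2073-residue string is a placeholder standing in for the real 564 sequence, which is not reproduced here.

```python
# Domain boundaries of protein 564 (MC58 numbering), 1-based and inclusive.
DOMAINS_564 = {
    "A": (79, 360),
    "B": (361, 731),
    "C": (732, 2044),
    "D": (2045, 2073),
}


def domain_span(names: str) -> tuple[int, int]:
    """Return (start, end) covering one domain or a run of adjacent domains, e.g. 'ab'."""
    spans = [DOMAINS_564[name] for name in names.upper()]
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        if next_start != prev_end + 1:
            raise ValueError("domains are not adjacent")
    return spans[0][0], spans[-1][1]


def extract(protein: str, names: str) -> str:
    """Slice the requested domain(s) out of the full-length protein."""
    start, end = domain_span(names)
    return protein[start - 1:end]


# Placeholder sequence of the right length; not the real 564 protein.
placeholder_564 = "X" * 2073

for construct in ("a", "b", "c", "d", "ab", "cd"):
    print(construct, domain_span(construct), len(extract(placeholder_564, construct)))
```

Residues 1-78 fall outside the four domains in this scheme, which is why domain A starts at position 79.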



These domains show the following homologies: 

• Domain A shows homology to other bacterial toxins:

gb|AAG03431.1|AE004443_9 probable hemagglutinin [Pseudomonas aeruginosa] (38%)
gb|AAC31981.1| (139897) HecA [Pectobacterium chrysanthemi] (45%)
emb|CAA36409.1| (X52156) filamentous hemagglutinin [Bordetella pertussis] (31%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (26%)
gb|AAA25657.1| (M30186) HpmA precursor [Proteus mirabilis] (29%)

• Domain B shows no homology, and is specific to 564.

• Domain C shows homology to:

gb|AAF84995.1|AE004032 HA-like secreted protein [Xylella fastidiosa] (33%)
gb|AAG05850.1|AE004673 hypothetical protein [Pseudomonas aeruginosa] (27%)
gb|AAF68414.1|AF237928 putative FHA [Pasteurella multocida] (23%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (23%)
pir||S21010 FHA B precursor [Bordetella pertussis] (20%)

• Domain D shows homology to other bacterial toxins:

gb|AAF84995.1|AE004032_14 HA-like secreted protein [Xylella fastidiosa] (29%)

Using the MC58 strain sequence, good intracellular expression of 564ab was obtained in the
form of GST-fusions (no purification) and his-tagged protein; this domain-pair was also
expressed as a lipoprotein, which showed moderate expression in the outer membrane/
supernatant fraction.

The b domain showed moderate intracellular expression when expressed as a his-tagged
product (no purification), and good expression as a GST-fusion.

The c domain showed good intracellular expression as a GST-fusion, but was insoluble. The
d domain showed moderate intracellular expression as a his-tagged product (no purification).
The cd protein domain-pair showed moderate intracellular expression (no purification) as a
GST-fusion.

Good bactericidal assay titres were observed using the c domain and the bc pair.

Example 20 - the 919 leader peptide

The 20mer leader peptide from 919 is discussed in example 1 above: 

MKKYLFRAAL YGIAAAILAA 

As shown in example 1, deletion of this leader improves heterologous expression, as does
substitution with the ORF4 leader peptide. The influence of the 919 leader on expression
was investigated by fusing the coding sequence to the PhoC reporter gene from Morganella
morganii [Thaller et al (1994) Microbiology 140:1341-1350]. The construct was cloned in
the pET21-b plasmid between the NdeI and XhoI sites (Figure 9):

1 MKKYLFRAAL YGIAAAILAA AIPAGNDATT KPDLYYLKNE QAIDSLKLLP
51 PPPEVGSIQF LNDQAMYEKG RMLRNTERGK QAQADADLAA GGVATAFSGA
101 FGYPITEKDS PELYKLLTNM IEDAGDLATR SAKEHYMRIR PFAFYGTETC
151 NTKDQKKLST NGSYPSGHTS IGWATALVLA EVNPANQDAI LERGYQLGQS
201 RVICGYHWQS DVDAARIVGS AAVATLHSDP AFQAQLAKAK QEFAQKSQK*

The level of expression of PhoC from this plasmid is >200-fold lower than that found for the
same construct but containing the native PhoC signal peptide. The same result was obtained
even after substitution of the T7 promoter with the E.coli Plac promoter. This means that the
influence of the 919 leader sequence on expression does not depend on the promoter used.

In order to investigate whether the results observed were due to some peculiarity of the 919 signal
peptide nucleotide sequence (secondary structure formation, sensitivity to RNases, etc.) or
to protein instability induced by the presence of this signal peptide, a number of mutants
were generated. The approach used was a substitution of nucleotides of the 919 signal
peptide sequence by cloning synthetic linkers containing degenerate codons. In this way,
mutants were obtained with nucleotide and/or amino acid substitutions.

Two different linkers were used, designed to produce mutations in two different regions of
the 919 signal peptide sequence, in the first 19 base pairs (L1) and between bases 20-36 (S1).

L1: 5' T ATG AAa/g TAc/t c/tTN TTt/c a/cGC GCC GCC CTG TAC GGC ATC GCC GCC
GCC ATC CTC GCC GCC GCG ATC CC 3'

S1: 5' T ATG AAA AAA TAC CTA TTC CGa/g GCN GCN c/tTa/g TAc/t GGc/g ATC GCC
GCC GCC ATC CTC GCC GCC GCG ATC CC 3'
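
To make the effect of the degenerate positions concrete, the sketch below enumerates every nucleotide variant of the degenerate region of the S1 linker and translates each one. It assumes that the linker reads exactly as printed above, that `N` means any of the four bases, and that the standard genetic code applies; `S1_REGION` and the helper names are illustrative, not taken from the source.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Degenerate part of the S1 linker (assumed reading): CGa/g GCN GCN c/tTa/g TAc/t GGc/g.
# Each entry lists the bases allowed at one nucleotide position.
S1_REGION = [
    "C", "G", "AG",      # CGa/g
    "G", "C", "ACGT",    # GCN
    "G", "C", "ACGT",    # GCN
    "CT", "T", "AG",     # c/tTa/g
    "T", "A", "CT",      # TAc/t
    "G", "G", "CG",      # GGc/g
]

variants = ["".join(choice) for choice in product(*S1_REGION)]
peptides = {translate(v) for v in variants}

print(len(variants), "nucleotide variants")
print("encoded peptides:", sorted(peptides))
```

Under these assumptions every designed S1 variant is silent at the amino acid level, so this linker on its own probes the nucleotide sequence rather than the peptide sequence; clones carrying amino acid changes or deletions fall outside the designed set.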

The alignment of some of the mutants obtained is given below. 

L1 mutants:

9L1-a   ATGAAGAAGTACCTTTTCAGCGCCGCC
9L1-e   ATGAAAAAATACTTTTTCCGCGCCGCC
9L1-d   ATGAAAAAATACTTTTTCCGCGCCGCC
9L1-f   ATGAAAAAATATCTCTTTAGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC
919sp   ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9L1a    MKKYLFSAA
9L1e    MKKYFFRAA
9L1d    MKKYFFRAA
9L1f    MKKYLFSAALYGIAAAILAA
919sp   MKKYLFRAALYGIAAAILAA (i.e. native signal peptide)

S1 mutants:

9S1-e   ATGAAAAAATACCTATTC                  ATCGCCGCCGCCATCCTCGCCGCC
9S1-c   ATGAAAAAATACCTATTCCGAGCTGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-b   ATGAAAAAATACCTATTCCGGGCCGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-i   ATGAAAAAATACCTATTCCGGGCGGCTTTGTACGGGATCGCCGCCGCCATCCTCGCCGCC
919sp   ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9S1e    MKKYLF      IAAAILAA
9S1c    MKKYLFRAAQYGIAAAILAA
9S1b    MKKYLFRAAQYGIAAAILAA
9S1i    MKKYLFRAALYGIAAAILAA
919sp   MKKYLFRAALYGIAAAILAA
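
The correspondence between the nucleotide and peptide alignments above can be checked by translating each mutant sequence. The following is a minimal sketch using the standard genetic code; the dictionary `MUTANTS` simply copies a few of the sequences listed above, and the helper names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# A selection of the signal peptide coding sequences from the alignment above.
MUTANTS = {
    "9L1-a": "ATGAAGAAGTACCTTTTCAGCGCCGCC",
    "9L1-e": "ATGAAAAAATACTTTTTCCGCGCCGCC",
    "9L1-f": "ATGAAAAAATATCTCTTTAGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC",
    "9S1-c": "ATGAAAAAATACCTATTCCGAGCTGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC",
    "9S1-i": "ATGAAAAAATACCTATTCCGGGCGGCTTTGTACGGGATCGCCGCCGCCATCCTCGCCGCC",
    "919sp": "ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC",
}

PEPTIDES = {name: translate(dna) for name, dna in MUTANTS.items()}
for name, peptide in PEPTIDES.items():
    print(f"{name:6s} {peptide}")
```

Comparing each translated peptide with the `919sp` entry distinguishes silent nucleotide changes (e.g. 9S1-i) from amino acid substitutions (e.g. 9L1-f, 9S1-c).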



As shown in the sequence alignments, most of the mutants analysed contain in-frame
deletions which were unexpectedly produced by the host cells.

Selection of the mutants was performed by transforming E. coli BL21(DE3) cells with DNA
prepared from a mixture of L1 and S1 mutated clones. Single transformants were screened
for high PhoC activity by streaking them onto LB plates containing 100 µg/ml ampicillin,
50 µg/ml methyl green, 1 mg/ml PDP (phenolphthaleindiphosphate). On this medium PhoC-
producing cells become green (Figure 10).






A quantitative analysis of PhoC produced by these mutants was carried out in liquid medium 
using pNPP as a substrate for PhoC activity. The specific activities measured in cell extracts 
and supernatants of mutants grown in liquid medium for 0, 30, 90, 180 min. were: 

CELL EXTRACTS

Time (min)         0        30        90       180
control         0.00      0.00      0.00      0.00
9phoC           1.11      1.11      3.33      4.44
9S1e          102.12    111.00    149.85    172.05
9L1a          206.46    111.00     94.35     83.25
9L1d            5.11      4.77      4.00      3.11
9L1f           27.75     94.35     82.14     36.63
9S1b          156.51    111.00     72.15     28.86
9S1c           72.15     33.30     21.09     14.43
9S1i          156.51     83.25     55.50     26.64
phoCwt        194.25    180.93    149.85    142.08



SUPERNATANTS

Time (min)         0        30        90       180
control         0.00      0.00      0.00      0.00
9phoC           0.33      0.00      0.00      0.00
9S1e            0.11      0.22      0.44      0.89
9L1a            4.88      5.99      5.99      7.22
9L1d            0.11      0.11      0.11      0.11
9L1f            0.11      0.22      0.11      0.11
9S1b            1.44      1.44      1.44      1.67
9S1c            0.44      0.78      0.56      0.67
9S1i            0.22      0.44      0.22      0.78
phoCwt         34.41     43.29     87.69    177.60



Some of the mutants produce high amounts of PhoC and, in particular, mutant 9L1a can
secrete PhoC into the culture medium. This is noteworthy since the signal peptide sequence of
this mutant is only 9 amino acids long. This is the shortest signal peptide described to date.
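
The comparison drawn above can be reproduced from the two activity tables with a short script. This is a minimal sketch: the values are the 180 min specific activities transcribed from the tables, the grouping into "mutants" (everything except the control, 9phoC and phoCwt) is an assumption made for illustration, and the variable names are not from the source.

```python
# PhoC specific activities at 180 min, transcribed from the tables above.
CELL_EXTRACT_180 = {
    "control": 0.00, "9phoC": 4.44, "9S1e": 172.05, "9L1a": 83.25, "9L1d": 3.11,
    "9L1f": 36.63, "9S1b": 28.86, "9S1c": 14.43, "9S1i": 26.64, "phoCwt": 142.08,
}
SUPERNATANT_180 = {
    "control": 0.00, "9phoC": 0.00, "9S1e": 0.89, "9L1a": 7.22, "9L1d": 0.11,
    "9L1f": 0.11, "9S1b": 1.67, "9S1c": 0.67, "9S1i": 0.78, "phoCwt": 177.60,
}

# Assumed grouping: every construct except the references counts as a mutant.
REFERENCES = {"control", "9phoC", "phoCwt"}
mutants = [name for name in CELL_EXTRACT_180 if name not in REFERENCES]

# Rank mutants by activity found in the culture supernatant (highest first).
ranking = sorted(mutants, key=lambda name: SUPERNATANT_180[name], reverse=True)

# Fold change in cell-extract activity relative to the unmutated 919 leader (9phoC).
fold_vs_9phoC = {
    name: CELL_EXTRACT_180[name] / CELL_EXTRACT_180["9phoC"] for name in mutants
}

print("supernatant ranking:", ranking)
for name in mutants:
    print(f"{name}: {fold_vs_9phoC[name]:.1f}-fold vs 9phoC (cell extract)")
```

The ranking is only as meaningful as the underlying assay; no replicate or error information is given in the tables, so small differences between mutants should not be over-interpreted.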



Example 21 - C-terminal deletions of Maf-related proteins

MafB-related proteins include 730, ORF46 and ORF29.

The 730 protein from MC58 has the following sequence: 

1 VKPLRRLTNL LAACAVAAAA LIQPALAADL AQDPFITDNA QRQHYEPGGK
51 YHLFGDPRGS VSDRTGKINV IQDYTHQMGN LLIQQANING TIGYHTRFSG
101 HGHEEHAPFD NHAADSASEE KGNVDEGFTV YRLNWEGHEH HPADAYDGPK
151 GGNYPKPTGA RDEYTYHVNG TARSIKLNPT DTRSIRQRIS DNYSNLGSNF
201 SDRADEANRK MFEHNAKLDR WGNSMEFING VAAGALNPFI SAGEALGIGD
251 ILYGTRYAID KAAMRNIAPL PAEGKFAVTG GLGSVAGFEK NTREAVDRWI
301 QENPNAAETV EAVFNVAAAA KVAKLAKAAK PGKAAVSGDF ADSYKKKLAL
351 SDSARQLYQN AKYREALDIH YEDLIRRKTD GSSKFINGRE IDAVTNDALI
401 QAKRTISAID KPKNFLNQKN RKQIKATIEA ANQQGKRAEF WFKYGVHSQV
451 KSYIESKGGI VKTGLGD*

The leader peptide is underlined.

730 shows similar features to ORF46 (see example 8 above):

- as for ORF46, the conservation of the 730 sequence among MenB, MenA and gonococcus
is high (>80%) only for the N-terminal portion. The C-terminus, from ~340, is highly
divergent.

- its predicted secondary structure contains a hydrophobic segment spanning the central
region of the molecule (aa. 227-247).

- expression of the full-length gene in E. coli gives very low yields of protein. Expression
from tagged or untagged constructs where the signal peptide sequence has been omitted
has a toxic effect on the host cells. In other words, the presence of the full-length mature
protein in the cytoplasm is highly toxic for the host cell while its translocation to the
periplasm (mediated by the signal peptide) has no detectable effect on cell viability. This
"intracellular toxicity" of 730 is particularly high since clones for expression of the
leaderless 730 can only be obtained at very low frequency using a recA genetic
background (E. coli strains: HB101 for cloning; HMS174(DE3) for expression).

To overcome this toxicity, a similar approach was used for 730 as described in example 8 for
ORF46. Four C-terminal truncated forms were obtained, each of which is well expressed. All
were obtained from intracellular expression of His-tagged leaderless 730.

Form A consists of the N-terminal hydrophilic region of the mature protein (aa. 28-226).
This was purified as a soluble His-tagged product, having a higher-than-expected MW.

Form B extends to the end of the region conserved between serogroups (aa. 28-340). This
was purified as an insoluble His-tagged product.

The C-terminal truncated forms named C1 and C2 were obtained after screening for clones
expressing high levels of 730-His in strain HMS174(DE3). Briefly, the pET21b
plasmid containing the His-tagged sequence coding for the full-length mature 730 protein
was used to transform the recA strain HMS174(DE3). Transformants were obtained at low
frequency which showed two phenotypes: large colonies and very small colonies. Several
large and small colonies were analysed for expression of the 730-His clone. Only cells from
large colonies over-expressed a protein recognised by anti-730A antibodies. However, the
protein over-expressed in different clones showed differences in molecular mass.
Sequencing of two of the clones revealed that in both cases integration of an E. coli IS
sequence had occurred within the sequence coding for the C-terminal region of 730. The two
integration events have produced in-frame fusions with 1 additional codon in the case of C1,
and 12 additional codons in the case of C2 (Figure 11). The resulting "mutant" forms of 730
have the following sequences:

730-C1 (due to an IS1 insertion - Figure 11A)

1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH 

51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE 

101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK 

151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME 

201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF 

251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVKAVFNV AAAAKVAKLA 

301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LDIHYEDLIR 

351 RKTDGSSKFI NGREIDAVTN DALIQAR* 

The additional amino acid produced by the insertion is underlined. 

730-C2 (due to an IS5 insertion - Figure 11B)

1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH 
51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE 
101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK 
151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME 
201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF 
251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVEAVFNV AAAAKVAKLA 
301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LGKVRISGEI 
351 LLG * 

The additional amino acids produced by the insertion are underlined. 

In conclusion, intracellular expression of the 730-C1 form gives a very high level of protein
and has no toxic effect on the host cells, whereas the presence of the native C-terminus is
toxic. These data suggest that the "intracellular toxicity" of 730 is associated with the
C-terminal 65 amino acids of the protein.

Equivalent truncation of ORF29 to the first 231 or 368 amino acids has been performed, 
using expression with or without the leader peptide (amino acids 1-26; deletion gives 
cytoplasmic expression) and with or without a His-tag. 



Example 22 - domains in 961 

As described in example 9 above, the GST-fusion of 961 was the best-expressed in E.coli.
To improve expression, the protein was divided into domains (figure 12).

The domains of 961 were designed on the basis of YadA (an adhesin produced by Yersinia
which has been demonstrated to be localized on the bacterial surface, where it forms
oligomers that generate surface projections [Hoiczyk et al. (2000) EMBO J 19:5989-99]) and
are: leader peptide, head domain, coiled-coil region (stalk), and membrane anchor domain.

These domains were expressed with or without the leader peptide, and optionally fused
either to a C-terminal His-tag or to N-terminal GST. E.coli clones expressing different
domains of 961 were analyzed by SDS-PAGE and western blot for the production and
localization of the expressed protein, from over-night (o/n) culture or after 3 hours induction
with IPTG. The results were:





                  Total lysate      Periplasm         Supernatant       OMV
                  (Western Blot)    (Western Blot)    (Western Blot)    SDS-PAGE
961 (o/n)
961 (IPTG)        +/-
961-L (o/n)       +                                                     +
961-L (IPTG)      +                                                     +
961c-L (o/n)
961c-L (IPTG)     +                 +                 +
961Δ1-L (o/n)
961Δ1-L (IPTG)    +                                                     +



The results show that in E.coli:

• 961-L is highly expressed and localized on the outer membrane. By western blot analysis
two specific bands have been detected: one at ~45kDa (the predicted molecular weight) and
one at ~180kDa, indicating that 961-L can form oligomers. Additionally, these aggregates
are more expressed in the over-night culture (without IPTG induction). OMV preparations of
this clone were used to immunize mice and serum was obtained. Using the over-night culture
(predominantly oligomeric form) the serum was bactericidal; using the IPTG-induced culture
(predominantly monomeric) it was not bactericidal.

• 961Δ1-L (with a partial deletion in the anchor region) is highly expressed and localized
on the outer membrane, but does not form oligomers;

• 961c-L (without the anchor region) is produced in soluble form and exported into the
supernatant.

Titres in ELISA and in the serum bactericidal assay using His-fusions were as follows:





                     ELISA    Bactericidal
961a (aa 24-268)     24397    4096
961b (aa 269-405)    7763     64
961c-L               29770    8192
961c (2996)          30774    >65536
961c (MC58)          33437    16384
961d                 26069    >65536



E.coli clones expressing different forms of 961 (961, 961-L, 961Δ1-L and 961c-L) were used
to investigate whether 961 is an adhesin (c.f. YadA). An adhesion assay was performed using
(a) human epithelial cells and (b) E.coli clones after either over-night culture or three
hours IPTG induction. 961-L grown over-night (961Δ1-L) and IPTG-induced 961c-L (the
clones expressing protein on surface) adhere to human epithelial cells.

961c was also used in hybrid proteins (see above). As 961 and its domain variants direct 
efficient expression, they are ideally suited as the N-terminal portion of a hybrid protein. 



Example 23 - further hybrids 

Further hybrid proteins of the invention are shown below (see also Figure 14). These are 
advantageous when compared to the individual proteins: 

ORF46.1-741

1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA 

51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC 

101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG 

151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA 

201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA 

251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA 

301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA 

351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA 

401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC 

451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA 

501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG 

551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC 

601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA 

651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA 

701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG 

751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC 

801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG 

851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA 

901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA 

951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA 

1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT 

1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT 

1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG 

1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG 

1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GGGGTGGTGT 

1251 CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA TGCACTAACC GCACCGCTCG 

1301 ACCATAAAGA CAAAGGTTTG CAGTCTTTGA CGCTGGATCA GTCCGTCAGG 

1351 AAAAACGAGA AACTGAAGCT GGCGGCACAA GGTGCGGAAA AAACTTATGG 

1401 AAACGGTGAC AGCCTCAATA CGGGCAAATT GAAGAACGAC AAGGTCAGCC 

1451 GTTTCGACTT TATCCGCCAA ATCGAAGTGG ACGGGCAGCT CATTACCTTG 

1501 GAGAGTGGAG AGTTCCAAGT ATACAAACAA AGCCATTCCG CCTTAACCGC 

1551 CTTTCAGACC GAGCAAATAC AAGATTCGGA GCATTCCGGG AAGATGGTTG 

1601 CGAAACGCCA GTTCAGAATC GGCGACATAG CGGGCGAACA TACATCTTTT
1651 GACAAGCTTC CCGAAGGCGG CAGGGCGACA TATCGCGGGA CGGCGTTCGG
1701 TTCAGACGAT GCCGGCGGAA AACTGACCTA CACCATAGAT TTCGCCGCCA
1751 AGCAGGGAAA CGGCAAAATC GAACATTTGA AATCGCCAGA ACTCAATGTC
1801 GACCTGGCCG CCGCCGATAT CAAGCCGGAT GGAAAACGCC ATGCCGTCAT
1851 CAGCGGTTCC GTCCTTTACA ACCAAGCCGA GAAAGGCAGT TACTCCCTCG
1901 GTATCTTTGG CGGAAAAGCC CAGGAAGTTG CCGGCAGCGC GGAAGTGAAA
1951 ACCGTAAACG GCATACGCCA TATCGGCCTT GCCGCCAAGC AACTCGAGCA
2001 CCACCACCAC CACCACTGA

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
401 NFEKHVKYDT GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR
451 KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL
501 ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF
551 DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV
601 DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK
651 TVNGIRHIGL AAKQLEHHHH HH*

ORF46.1-961

1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT
1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA
2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC
2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA
2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC
2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT
2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC GGTTCAATGT AACGGCTGCA
2251 GTCGGCGGCT ACAAATCCGA ATCGGCAGTC GCCATCGGTA CCGGCTTCCG
2301 CTTTACCGAA AACTTTGCCG CCAAAGCAGG CGTGGCAGTC GGCACTTCGT
2351 CCGGTTCTTC CGCAGCCTAC CATGTCGGCG TCAATTACGA GTGGCTCGAG
2401 CACCACCACC ACCACCACTG A

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI
451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK
501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET
551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK
601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA
651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK
701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGRFNVTAA
751 VGGYKSESAV AIGTGFRFTE NFAAKAGVAV GTSSGSSAAY HVGVNYEWLE
801 HHHHHH*

ORF46.1-961c



1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT
1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA
2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC
2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA
2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC
2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT
2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC TCGAGCACCA CCACCACCAC
2251 CACTGA

1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI
451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK
501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET
551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK
601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA
651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK
701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGLEHHHHH
751 H*

961-ORF46.1

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG
1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC
1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC
1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG
1151 GATCCGGAGG AGGAGGATCA GATTTGGCAA ACGATTCTTT TATCCGGCAG
1201 GTTCTCGACC GTCAGCATTT CGAACCCGAC GGGAAATACC ACCTATTCGG
1251 CAGCAGGGGG GAACTTGCCG AGCGCAGCGG CCATATCGGA TTGGGAAAAA
1301 TACAAAGCCA TCAGTTGGGC AACCTGATGA TTCAACAGGC GGCCATTAAA
1351 GGAAATATCG GCTACATTGT CCGCTTTTCC GATCACGGGC ACGAAGTCCA
1401 TTCCCCCTTC GACAACCATG CCTCACATTC CGATTCTGAT GAAGCCGGTA
1451 GTCCCGTTGA CGGATTTAGC CTTTACCGCA TCCATTGGGA CGGATACGAA
1501 CACCATCCCG CCGACGGCTA TGACGGGCCA CAGGGCGGCG GCTATCCCGC
1551 TCCCAAAGGC GCGAGGGATA TATACAGCTA CGACATAAAA GGCGTTGCCC
1601 AAAATATCCG CCTCAACCTG ACCGACAACC GCAGCACCGG ACAACGGCTT
1651 GCCGACCGTT TCCACAATGC CGGTAGTATG CTGACGCAAG GAGTAGGCGA
1701 CGGATTCAAA CGCGCCACCC GATACAGCCC CGAGCTGGAC AGATCGGGCA
1751 ATGCCGCCGA AGCCTTCAAC GGCACTGCAG ATATCGTTAA AAACATCATC
1801 GGCGCGGCAG GAGAAATTGT CGGCGCAGGC GATGCCGTGC AGGGCATAAG
1851 CGAAGGCTCA AACATTGCTG TCATGCACGG CTTGGGTCTG CTTTCCACCG
1901 AAAACAAGAT GGCGCGCATC AACGATTTGG CAGATATGGC GCAACTCAAA
1951 GACTATGCCG CAGCAGCCAT CCGCGATTGG GCAGTCCAAA ACCCCAATGC
2001 CGCACAAGGC ATAGAAGCCG TCAGCAATAT CTTTATGGCA GCCATCCCCA
2051 TCAAAGGGAT TGGAGCTGTT CGGGGAAAAT ACGGCTTGGG CGGCATCACG
2101 GCACATCCTA TCAAGCGGTC GCAGATGGGC GCGATCGCAT TGCCGAAAGG
2151 GAAATCCGCC GTCAGCGACA ATTTTGCCGA TGCGGCATAC GCCAAATACC
2201 CGTCCCCTTA CCATTCCCGA AATATCCGTT CAAACTTGGA GCAGCGTTAC
2251 GGCAAAGAAA ACATCACCTC CTCAACCGTG CCGCCGTCAA ACGGCAAAAA
2301 TGTCAAACTG GCAGACCAAC GCCACCCGAA GACAGGCGTA CCGTTTGACG
2351 GTAAAGGGTT TCCGAATTTT GAGAAGCACG TGAAATATGA TACGCTCGAG
2401 CACCACCACC ACCACCACTG A



1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGS DLANDSFIRQ
401 VLDRQHFEPD GKYHLFGSRG ELAERSGHIG LGKIQSHQLG NLMIQQAAIK
451 GNIGYIVRFS DHGHEVHSPF DNHASHSDSD EAGSPVDGFS LYRIHWDGYE
501 HHPADGYDGP QGGGYPAPKG ARDIYSYDIK GVAQNIRLNL TDNRSTGQRL
551 ADRFHNAGSM LTQGVGDGFK RATRYSPELD RSGNAAEAFN GTADIVKNII
601 GAAGEIVGAG DAVQGISEGS NIAVMHGLGL LSTENKMARI NDLADMAQLK
651 DYAAAAIRDW AVQNPNAAQG IEAVSNIFMA AIPIKGIGAV RGKYGLGGIT
701 AHPIKRSQMG AIALPKGKSA VSDNFADAAY AKYPSPYHSR NIRSNLEQRY
751 GKENITSSTV PPSNGKNVKL ADQRHPKTGV PFDGKGFPNF EKHVKYDTLE
801 HHHHHH*
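
The junction and tag regions of the 961-ORF46.1 construct can be checked by translating the corresponding stretches of the nucleotide sequence above. The sketch below is illustrative only: the two fragments are copied from the listing (positions 1150-1170 and 2395-2421), the standard genetic code is assumed, and the observation that they contain the recognition sequences GGATCC (BamHI) and CTCGAG (XhoI) is offered as a reading aid rather than as a statement of the cloning strategy actually used.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# In-frame fragments copied from the 961-ORF46.1 nucleotide sequence above.
LINKER_DNA = "GGATCCGGAGGAGGAGGATCA"         # junction between 961 and ORF46.1
TAG_DNA = "CTCGAGCACCACCACCACCACCACTGA"     # C-terminal end of the construct

linker_peptide = translate(LINKER_DNA)
tag_peptide = translate(TAG_DNA)

print("linker:", linker_peptide)
print("tag   :", tag_peptide)
```

Both peptides can be located in the protein sequence above: the glycine-rich linker sits between the 961 and ORF46.1 portions (...YEWGSGGGGSDLAN...) and the hexahistidine tag follows the final LE dipeptide.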

30 961-741 

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

35 201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT 

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG 

1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC 

1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC 

1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG 

1151 GATCCGGAGG GGGTGGTGTC GCCGCCGACA TCGGTGCGGG GCTTGCCGAT 

1201 GCACTAACCG CACCGCTCGA CCATAAAGAC AAAGGTTTGC AGTCTTTGAC 

1251 GCTGGATCAG TCCGTCAGGA AAAACGAGAA ACTGAAGCTG GCGGCACAAG 

1301 GTGCGGAAAA AACTTATGGA AACGGTGACA GCCTCAATAC GGGCAAATTG 

1351 AAGAACGACA AGGTCAGCCG TTTCGACTTT ATCCGCCAAA TCGAAGTGGA 

1401 CGGGCAGCTC ATTACCTTGG AGAGTGGAGA GTTCCAAGTA TACAAACAAA 

1451 GCCATTCCGC CTTAACCGCC TTTCAGACCG AGCAAATACA AGATTCGGAG 

1501 CATTCCGGGA AGATGGTTGC GAAACGCCAG TTCAGAATCG GCGACATAGC 

1551 GGGCGAACAT ACATCTTTTG ACAAGCTTCC CGAAGGCGGC AGGGCGACAT 

1601 ATCGCGGGAC GGCGTTCGGT TCAGACGATG CCGGCGGAAA ACTGACCTAC 

1651 ACCATAGATT TCGCCGCCAA GCAGGGAAAC GGCAAAATCG AACATTTGAA 

1701 ATCGCCAGAA CTCAATGTCG ACCTGGCCGC CGCCGATATC AAGCCGGATG 

1751 GAAAACGCCA TGCCGTCATC AGCGGTTCCG TCCTTTACAA CCAAGCCGAG 

1801 AAAGGCAGTT ACTCCCTCGG TATCTTTGGC GGAAAAGCCC AGGAAGTTGC 

1851 CGGCAGCGCG GAAGTGAAAA CCGTAAACGG CATACGCCAT ATCGGCCTTG 

1901 CCGCCAAGCA ACTCGAGCAC CACCACCACC ACCACTGA 



1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT 

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA 

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG 

351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGV AADIGAGLAD 

401 ALTAPLDHKD KGLQSLTLDQ SVRKNEKLKL AAQGAEKTYG NGDSLNTGKL 

451 KNDKVSRFDF IRQIEVDGQL ITLESGEFQV YKQSHSALTA FQTEQIQDSE 

501 HSGKMVAKRQ FRIGDIAGEH TSFDKLPEGG RATYRGTAFG SDDAGGKLTY 

551 TIDFAAKQGN GKIEHLKSPE LNVDLAAADI KPDGKRHAVI SGSVLYNQAE 

601 KGSYSLGIFG GKAQEVAGSA EVKTVNGIRH IGLAAKQLEH HHHHH* 

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961-983 

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT 

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG 

1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC 

1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC 

1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG 

1151 GATCCGGCGG AGGCGGCACT TCTGCGCCCG ACTTCAATGC AGGCGGTACC 

1201 GGTATCGGCA GCAACAGCAG AGCAACAACA GCGAAATCAG CAGCAGTATC 

1251 TTACGCCGGT ATCAAGAACG AAATGTGCAA AGACAGAAGC ATGCTCTGTG 

1301 CCGGTCGGGA TGACGTTGCG GTTACAGACA GGGATGCCAA AATCAATGCC 

1351 CCCCCCCCGA ATCTGCATAC CGGAGACTTT CCAAACCCAA ATGACGCATA 

1401 CAAGAATTTG ATCAACCTCA AACCTGCAAT TGAAGCAGGC TATACAGGAC 

1451 GCGGGGTAGA GGTAGGTATC GTCGACACAG GCGAATCCGT CGGCAGCATA 

1501 TCCTTTCCCG AACTGTATGG CAGAAAAGAA CACGGCTATA ACGAAAATTA 

1551 CAAAAACTAT ACGGCGTATA TGCGGAAGGA AGCGCCTGAA GACGGAGGCG 

1601 GTAAAGACAT TGAAGCTTCT TTCGACGATG AGGCCGTTAT AGAGACTGAA 

1651 GCAAAGCCGA CGGATATCCG CCACGTAAAA GAAATCGGAC ACATCGATTT 

1701 GGTCTCCCAT ATTATTGGCG GGCGTTCCGT GGACGGCAGA CCTGCAGGCG 

1751 GTATTGCGCC CGATGCGACG CTACACATAA TGAATACGAA TGATGAAACC 

1801 AAGAACGAAA TGATGGTTGC AGCCATCCGC AATGCATGGG TCAAGCTGGG 

1851 CGAACGTGGC GTGCGCATCG TCAATAACAG TTTTGGAACA ACATCGAGGG 

1901 CAGGCACTGC CGACCTTTTC CAAATAGCCA ATTCGGAGGA GCAGTACCGC 

1951 CAAGCGTTGC TCGACTATTC CGGCGGTGAT AAAACAGACG AGGGTATCCG 

2001 CCTGATGCAA CAGAGCGATT ACGGCAACCT GTCCTACCAC ATCCGTAATA 

2051 AAAACATGCT TTTCATCTTT TCGACAGGCA ATGACGCACA AGCTCAGCCC 

2101 AACACATATG CCCTATTGCC ATTTTATGAA AAAGACGCTC AAAAAGGCAT 

2151 TATCACAGTC GCAGGCGTAG ACCGCAGTGG AGAAAAGTTC AAACGGGAAA 

2201 TGTATGGAGA ACCGGGTACA GAACCGCTTG AGTATGGCTC CAACCATTGC 

2251 GGAATTACTG CCATGTGGTG CCTGTCGGCA CCCTATGAAG CAAGCGTCCG 

2301 TTTCACCCGT ACAAACCCGA TTCAAATTGC CGGAACATCC TTTTCCGCAC 

2351 CCATCGTAAC CGGCACGGCG GCTCTGCTGC TGCAGAAATA CCCGTGGATG 

2401 AGCAACGACA ACCTGCGTAC CACGTTGCTG ACGACGGCTC AGGACATCGG 

2451 TGCAGTCGGC GTGGACAGCA AGTTCGGCTG GGGACTGCTG GATGCGGGTA 

2501 AGGCCATGAA CGGACCCGCG TCCTTTCCGT TCGGCGACTT TACCGCCGAT 

2551 ACGAAAGGTA CATCCGATAT TGCCTACTCC TTCCGTAACG ACATTTCAGG 

2601 CACGGGCGGC CTGATCAAAA AAGGCGGCAG CCAACTGCAA CTGCACGGCA 

2651 ACAACACCTA TACGGGCAAA ACCATTATCG AAGGCGGTTC GCTGGTGTTG 

2701 TACGGCAACA ACAAATCGGA TATGCGCGTC GAAACCAAAG GTGCGCTGAT 

2751 TTATAACGG6 GCGGCATCCG GCGGCAGCCT GAACAGCGAC GGCATTGTCT 

2801 ATCTGGCAGA TACCGACCAA TCCGGCGCAA ACGAAACCGT ACACATCAAA 

2851 GGCAGTCTGC AGCTGGACGG CAAAGGTACG CTGTACACAC GTTTGGGCAA 

2901 ACTGCTGAAA GTGGACGGTA CGGCGATTAT CGGCGGCAAG CTGTACATGT 

2951 CGGCACGCGG CAAGGGGGCA GGCTATCTCA ACAGTACCGG ACGACGTGTT 

3001 CCCTTCCTGA GTGCCGCCAA AATCGGGCAG GATTATTCTT TCTTCACAAA 

3051 CATCGAAACC GACGGCGGCC TGCTGGCTTC CCTCGACAGC GTCGAAAAAA 

3101 CAGCGGGCAG TGAAGGCGAC ACGCTGTCCT ATTATGTCCG TCGCGGCAAT 

3151 GCGGCACGGA CTGCTTCGGC AGCGGCACAT TCCGCGCCCG CCGGTCTGAA 

3201 ACACGCCGTA GAACAGGGCG GCAGCAATCT GGAAAACCTG ATGGTCGAAC 

3251 TGGATGCCTC CGAATCATCC GCAACACCCG AGACGGTTGA AACTGCGGCA 

3301 GCCGACCGCA CAGATATGCC GGGCATCCGC CCCTACGGCG CAACTTTCCG 

3351 CGCAGCGGCA GCCGTACAGC ATGCGAATGC CGCCGACGGT GTACGCATCT 

3401 TCAACAGTCT CGCCGCTACC GTCTATGCCG ACAGTACCGC CGCCCATGCC 

3451 GATATGCAGG GACGCCGCCT GAAAGCCGTA TCGGACGGGT TGGACCACAA 

3501 CGGCACGGGT CTGCGCGTCA TCGCGCAAAC CCAACAGGAC GGTGGAACGT 

3551 GGGAACAGGG CGGTGTTGAA GGCAAAATGC GCGGCAGTAC CCAAACCGTC 

3601 GGCATTGCCG CGAAAACCGG CGAAAATACG ACAGCAGCCG CCACACTGGG 

3651 CATGGGACGC AGCACATGGA GCGAAAACAG TGCAAATGCA AAAACCGACA 

3701 GCATTAGTCT GTTTGCAGGC ATACGGCACG ATGCGGGCGA TATCGGCTAT 

3751 CTCAAAGGCC TGTTCTCCTA CGGACGCTAC AAAAACAGCA TCAGCCGCAG 

3801 CACCGGTGCG GACGAACATG CGGAAGGCAG CGTCAACGGC ACGCTGATGC 

3851 AGCTGGGCGC ACTGGGCGGT GTCAACGTTC CGTTTGCCGC AACGGGAGAT 

3901 TTGACGGTCG AAGGCGGTCT GCGCTACGAC CTGCTCAAAC AGGATGCATT 

3951 CGCCGAAAAA GGCAGTGCTT TGGGCTGGAG CGGCAACAGC CTCACTGAAG 

4001 GCACGCTGGT CGGACTCGCG GGTCTGAAGC TGTCGCAACC CTTGAGCGAT 

4051 AAAGCCGTCC TGTTTGCAAC GGCGGGCGTG GAACGCGACC TGAACGGACG 

4101 CGACTACACG GTAACGGGCG GCTTTACCGG CGCGACTGCA GCAACCGGCA 

4151 AGACGGGGGC ACGCAATATG CCGCACACCC GTCTGGTTGC CGGCCTGGGC 

4201 GCGGATGTCG AATTCGGCAA CGGCTGGAAC GGCTTGGCAC GTTACAGCTA 

4251 CGCCGGTTCC AAACAGTACG GCAACCACAG CGGACGAGTC GGCGTAGGCT 

4301 ACCGGTTCCT CGAGCACCAC CACCACCACC ACTGA 



1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT 

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA 

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG 

351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGT SAPDFNAGGT 

401 GIGSNSRATT AKSAAVSYAG IKNEMCKDRS MLCAGRDDVA VTDRDAKINA 

451 PPPNLHTGDF PNPNDAYKNL INLKPAIEAG YTGRGVEVGI VDTGESVGSI 

501 SFPELYGRKE HGYNENYKNY TAYMRKEAPE DGGGKDIEAS FDDEAVIETE 

551 AKPTDIRHVK EIGHIDLVSH IIGGRSVDGR PAGGIAPDAT LHIMNTNDET 

601 KNEMMVAAIR NAWVKLGERG VRIVNNSFGT TSRAGTADLF QIANSEEQYR 

651 QALLDYSGGD KTDEGIRLMQ QSDYGNLSYH IRNKNMLFIF STGNDAQAQP 

701 NTYALLPFYE KDAQKGIITV AGVDRSGEKF KREMYGEPGT EPLEYGSNHC 

751 GITAMWCLSA PYEASVRFTR TNPIQIAGTS FSAPIVTGTA ALLLQKYPWM 

801 SNDNLRTTLL TTAQDIGAVG VDSKFGWGLL DAGKAMNGPA SFPFGDFTAD 

851 TKGTSDIAYS FRNDISGTGG LIKKGGSQLQ LHGNNTYTGK TIIEGGSLVL 

901 YGNNKSDMRV ETKGALIYNG AASGGSLNSD GIVYLADTDQ SGANETVHIK 

951 GSLQLDGKGT LYTRLGKLLK VDGTAIIGGK LYMSARGKGA GYLNSTGRRV 

1001 PFLSAAKIGQ DYSFFTNIET DGGLLASLDS VEKTAGSEGD TLSYYVRRGN 

1051 AARTASAAAH SAPAGLKHAV EQGGSNLENL MVELDASESS ATPETVETAA 

1101 ADRTDMPGIR PYGATFRAAA AVQHANAADG VRIFNSLAAT VYADSTAAHA 

1151 DMQGRRLKAV SDGLDHNGTG LRVIAQTQQD GGTWEQGGVE GKMRGSTQTV 

1201 GIAAKTGENT TAAATLGMGR STWSENSANA KTDSISLFAG IRHDAGDIGY 

1251 LKGLFSYGRY KNSISRSTGA DEHAEGSVNG TLMQLGALGG VNVPFAATGD 

1301 LTVEGGLRYD LLKQDAFAEK GSALGWSGNS LTEGTLVGLA GLKLSQPLSD 

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1351 KAVLFATAGV ERDLNGRDYT VTGGFTGATA ATGKTGARNM PHTRLVAGLG 
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961C-ORF46.1 

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT 

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGAGGAG 

1001 GATCAGATTT GGCAAACGAT TCTTTTATCC GGCAGGTTCT CGACCGTCAG 

1051 CATTTCGAAC CCGACGGGAA ATACCACCTA TTCGGCAGCA GGGGGGAACT 

1101 TGCCGAGCGC AGCGGCCATA TCGGATTGGG AAAAATACAA AGCCATCAGT 

1151 TGGGCAACCT GATGATTCAA CAGGCGGCCA TTAAAGGAAA TATCGGCTAC 

1201 ATTGTCCGCT TTTCCGATCA CGGGCACGAA GTCCATTCCC CCTTCGACAA 

1251 CCATGCCTCA CATTCCGATT CTGATGAAGC CGGTAGTCCC GTTGACGGAT 

1301 TTAGCCTTTA CCGCATCCAT TGGGACGGAT ACGAACACCA TCCCGCCGAC 

1351 GGCTATGACG GGCCACAGGG CGGCGGCTAT CCCGCTCCCA AAGGCGCGAG 

1401 GGATATATAC AGCTACGACA TAAAAGGCGT TGCCCAAAAT ATCCGCCTCA 

1451 ACCTGACCGA CAACCGCAGC ACCGGACAAC GGCTTGCCGA CCGTTTCCAC 

1501 AATGCCGGTA GTATGCTGAC GCAAGGAGTA GGCGACGGAT TCAAACGCGC 

1551 CACCCGATAC AGCCCCGAGC TGGACAGATC GGGCAATGCC GCCGAAGCCT 

1601 TCAACGGCAC TGCAGATATC GTTAAAAACA TCATCGGCGC GGCAGGAGAA 

1651 ATTGTCGGCG CAGGCGATGC CGTGCAGGGC ATAAGCGAAG GCTCAAACAT 

1701 TGCTGTCATG CACGGCTTGG GTCTGCTTTC CACCGAAAAC AAGATGGCGC 

1751 GCATCAACGA TTTGGCAGAT ATGGCGCAAC TCAAAGACTA TGCCGCAGCA 

1801 GCCATCCGCG ATTGGGCAGT CCAAAACCCC AATGCCGCAC AAGGCATAGA 

1851 AGCCGTCAGC AATATCTTTA TGGCAGCCAT CCCCATCAAA GGGATTGGAG 

1901 CTGTTCGGGG AAAATACGGC TTGGGCGGCA TCACGGCACA TCCTATCAAG 

1951 CGGTCGCAGA TGGGCGCGAT CGCATTGCCG AAAGGGAAAT CCGCCGTCAG 

2001 CGACAATTTT GCCGATGCGG CATACGCCAA ATACCCGTCC CCTTACCATT 

2051 CCCGAAATAT CCGTTCAAAC TTGGAGCAGC GTTACGGCAA AGAAAACATC 

2101 ACCTCCTCAA CCGTGCCGCC GTCAAACGGC AAAAATGTCA AACTGGCAGA 

2151 CCAACGCCAC CCGAAGACAG GCGTACCGTT TGACGGTAAA GGGTTTCCGA 

2201 ATTTTGAGAA GCACGTGAAA TATGATACGC TCGAGCACCA CCACCACCAC 

2251 CACTGA 



1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT 

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA 

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGSDLAND SFIRQVLDRQ 

351 HFEPDGKYHL FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY 

401 IVRFSDHGHE VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD 

451 GYDGPQGGGY PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH 

501 NAGSMLTQGV GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE 

551 IVGAGDAVQG ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA 

601 AIRDWAVQNP NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK 

651 RSQMGAIALP KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI 

701 TSSTVPPSNG KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTLEHHHHH 

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961C-741 


1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA 

651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA 

701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT 

751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA 

801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG 

851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA 

901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC 

951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGGGGTG 

1001 GTGTCGCCGC CGACATCGGT GCGGGGCTTG CCGATGCACT AACCGCACCG 

1051 CTCGACCATA AAGACAAAGG TTTGCAGTCT TTGACGCTGG ATCAGTCCGT 

1101 CAGGAAAAAC GAGAAACTGA AGCTGGCGGC ACAAGGTGCG GAAAAAACTT 

1151 ATGGAAACGG TGACAGCCTC AATACGGGCA AATTGAAGAA CGACAAGGTC 

1201 AGCCGTTTCG ACTTTATCCG CCAAATCGAA GTGGACGGGC AGCTCATTAC 

1251 CTTGGAGAGT GGAGAGTTCC AAGTATACAA ACAAAGCCAT TCCGCCTTAA 

1301 CCGCCTTTCA GACCGAGCAA ATACAAGATT CGGAGCATTC CGGGAAGATG 

1351 GTTGCGAAAC GCCAGTTCAG AATCGGCGAC ATAGCGGGCG AACATACATC 

1401 TTTTGACAAG CTTCCCGAAG GCGGCAGGGC GACATATCGC GGGACGGCGT 

1451 TCGGTTCAGA CGATGCCGGC GGAAAACTGA CCTACACCAT AGATTTCGCC 

1501 GCCAAGCAGG GAAACGGCAA AATCGAACAT TTGAAATCGC CAGAACTCAA 

1551 TGTCGACCTG GCCGCCGCCG ATATCAAGCC GGATGGAAAA CGCCATGCCG 

1601 TCATCAGCGG TTCCGTCCTT TACAACCAAG CCGAGAAAGG CAGTTACTCC 

1651 CTCGGTATCT TTGGCGGAAA AGCCCAGGAA GTTGCCGGCA GCGCGGAAGT 

1701 GAAAACCGTA AACGGCATAC GCCATATCGG CCTTGCCGCC AAGCAACTCG 

1751 AGCACCACCA CCACCACCAC TGA 



1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT 

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA 

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGVAADIG AGLADALTAP 

351 LDHKDKGLQS LTLDQSVRKN EKLKLAAQGA EKTYGNGDSL NTGKLKNDKV 

401 SRFDFIRQIE VDGQLITLES GEFQVYKQSH SALTAFQTEQ IQDSEHSGKM 

451 VAKRQFRIGD IAGEHTSFDK LPEGGRATYR GTAFGSDDAG GKLTYTIDFA 

501 AKQGNGKIEH LKSPELNVDL AAADIKPDGK RHAVISGSVL YNQAEKGSYS 

551 LGIFGGKAQE VAGSAEVKTV NGIRHIGLAA KQLEHHHHHH 



961C-983 

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC 

51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA 

101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT 

151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT 

201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG 

251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA 

301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC 

351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG 

401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT 

451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT 

501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG 

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3901 GCAACGGCGG GCGTGGAACG CGACCTGAAC GGACGCGACT ACACGGTAAC 

3951 GGGCGGCTTT ACCGGCGCGA CTGCAGCAAC CGGCAAGACG GGGGCACGCA 

4001 ATATGCCGCA CACCCGTCTG GTTGCCGGCC TGGGCGCGGA TGTCGAATTC 

4051 GGCAACGGCT GGAACGGCTT GGCACGTTAC AGCTACGCCG GTTCCAAACA 

4101 GTACGGCAAC CACAGCGGAC GAGTCGGCGT AGGCTACCGG TTCCTCGAGC 

4151 ACCACCACCA CCACCACTGA 

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT 

51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL 

101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA 

151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK 

201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS 

251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT 

301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGTSAPDF NAGGTGIGSN 

351 SRATTAKSAA VSYAGIKNEM CKDRSMLCAG RDDVAVTDRD AKINAPPPNL 

401 HTGDFPNPND AYKNLINLKP AIEAGYTGRG VEVGIVDTGE SVGSISFPEL 

451 YGRKEHGYNE NYKNYTAYMR KEAPEDGGGK DIEASFDDEA VIETEAKPTD 

501 IRHVKEIGHI DLVSHIIGGR SVDGRPAGGI APDATLHIMN TNDETKNEMM 

551 VAAIRNAWVK LGERGVRIVN NSFGTTSRAG TADLFQIANS EEQYRQALLD 

601 YSGGDKTDEG IRLMQQSDYG NLSYHIRNKN MLFIFSTGND AQAQPNTYAL 

651 LPFYEKDAQK GIITVAGVDR SGEKFKREMY GEPGTEPLEY GSNHCGITAM 

701 WCLSAPYEAS VRFTRTNPIQ IAGTSFSAPI VTGTAALLLQ KYPWMSNDNL 

751 RTTLLTTAQD IGAVGVDSKF GWGLLDAGKA MNGPASFPFG DFTADTKGTS 

801 DIAYSFRNDI SGTGGLIKKG GSQLQLHGNN TYTGKTIIEG GSLVLYGNNK 

851 SDMRVETKGA LIYNGAASGG SLNSDGIVYL ADTDQSGANE TVHIKGSLQL 

901 DGKGTLYTRL GKLLKVDGTA IIGGKLYMSA RGKGAGYLNS TGRRVPFLSA 

951 AKIGQDYSFF TNIETDGGLL ASLDSVEKTA GSEGDTLSYY VRRGNAARTA 

1001 SAAAHSAPAG LKHAVEQGGS NLENLMVELD ASESSATPET VETAAADRTD 

1051 MPGIRPYGAT FRAAAAVQHA NAADGVRIFN SLAATVYADS TAAHADMQGR 

1101 RLKAVSDGLD HNGTGLRVIA QTQQDGGTWE QGGVEGKMRG STQTVGIAAK 

1151 TGENTTAAAT LGMGRSTWSE NSANAKTDSI SLFAGIRHDA GDIGYLKGLF 

1201 SYGRYKNSIS RSTGADEHAE GSVNGTLMQL GALGGVNVPF AATGDLTVEG 

1251 GLRYDLLKQD AFAEKGSALG WSGNSLTEGT LVGLAGLKLS QPLSDKAVLF 

1301 ATAGVERDLN GRDYTVTGGF TGATAATGKT GARNMPHTRL VAGLGADVEF 

1351 GNGWNGLARY SYAGSKQYGN HSGRVGVGYR FLEHHHHHH* 



961cL-ORF46.1 

1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT 

51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG 

101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT 

151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC 

201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC 

251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC 

301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA 

351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG 

401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA 

451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA 

501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG 

551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC 

601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT 

651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG 

701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA 

751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC 

801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA 

851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA 

901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA 

951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC 

1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT 

1051 GGATCCGGAG GAGGAGGATC AGATTTGGCA AACGATTCTT TTATCCGGCA 

1101 GGTTCTCGAC CGTCAGCATT TCGAACCCGA CGGGAAATAC CACCTATTCG 

1151 GCAGCAGGGG GGAACTTGCC GAGCGCAGCG GCCATATCGG ATTGGGAAAA 

1201 ATACAAAGCC ATCAGTTGGG CAACCTGATG ATTCAACAGG CGGCCATTAA 

1251 AGGAAATATC GGCTACATTG TCCGCTTTTC CGATCACGGG CACGAAGTCC 

1301 ATTCCCCCTT CGACAACCAT GCCTCACATT CCGATTCTGA TGAAGCCGGT 

1351 AGTCCCGTTG ACGGATTTAG CCTTTACCGC ATCCATTGGG ACGGATACGA 

1401 ACACCATCCC GCCGACGGCT ATGACGGGCC ACAGGGCGGC GGCTATCCCG 

1451 CTCCCAAAGG CGCGAGGGAT ATATACAGCT ACGACATAAA AGGCGTTGCC 

1501 CAAAATATCC GCCTCAACCT GACCGACAAC CGCAGCACCG GACAACGGCT 

1551 TGCCGACCGT TTCCACAATG CCGGTAGTAT GCTGACGCAA GGAGTAGGCG 

1601 ACGGATTCAA ACGCGCCACC CGATACAGCC CCGAGCTGGA CAGATCGGGC 

1651 AATGCCGCCG AAGCCTTCAA CGGCACTGCA GATATCGTTA AAAACATCAT 

1701 CGGCGCGGCA GGAGAAATTG TCGGCGCAGG CGATGCCGTG CAGGGCATAA 

1751 GCGAAGGCTC AAACATTGCT GTCATGCACG GCTTGGGTCT GCTTTCCACC 

1801 GAAAACAAGA TGGCGCGCAT CAACGATTTG GCAGATATGG CGCAACTCAA 

1851 AGACTATGCC GCAGCAGCCA TCCGCGATTG GGCAGTCCAA AACCCCAATG 

1901 CCGCACAAGG CATAGAAGCC GTCAGCAATA TCTTTATGGC AGCCATCCCC 

1951 ATCAAAGGGA TTGGAGCTGT TCGGGGAAAA TACGGCTTGG GCGGCATCAC 

2001 GGCACATCCT ATCAAGCGGT CGCAGATGGG CGCGATCGCA TTGCCGAAAG 

2051 GGAAATCCGC CGTCAGCGAC AATTTTGCCG ATGCGGCATA CGCCAAATAC 

2101 CCGTCCCCTT ACCATTCCCG AAATATCCGT TCAAACTTGG AGCAGCGTTA 

2151 CGGCAAAGAA AACATCACCT CCTCAACCGT GCCGCCGTCA AACGGCAAAA 

2201 ATGTCAAACT GGCAGACCAA CGCCACCCGA AGACAGGCGT ACCGTTTGAC 

2251 GGTAAAGGGT TTCCGAATTT TGAGAAGCAC GTGAAATATG ATACGTAACT 

2301 CGAG 



1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING 

51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN 

101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI 

151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV 

201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA 

251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT 

301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG 

351 GSGGGGSDLA NDSFIRQVLD RQHFEPDGKY HLFGSRGELA ERSGHIGLGK 

401 IQSHQLGNLM IQQAAIKGNI GYIVRFSDHG HEVHSPFDNH ASHSDSDEAG 

451 SPVDGFSLYR IHWDGYEHHP ADGYDGPQGG GYPAPKGARD IYSYDIKGVA 

501 QNIRLNLTDN RSTGQRLADR FHNAGSMLTQ GVGDGFKRAT RYSPELDRSG 

551 NAAEAFNGTA DIVKNIIGAA GEIVGAGDAV QGISEGSNIA VMHGLGLLST 

601 ENKMARINDL ADMAQLKDYA AAAIRDWAVQ NPNAAQGIEA VSNIFMAAIP 

651 IKGIGAVRGK YGLGGITAHP IKRSQMGAIA LPKGKSAVSD NFADAAYAKY 

701 PSPYHSRNIR SNLEQRYGKE NITSSTVPPS NGKNVKLADQ RHPKTGVPFD 

751 GKGFPNFEKH VKYDT* 



1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT 

51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG 

101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT 

151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC 

201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC 

251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC 

301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA 

351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG 

401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA 

451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA 

501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG 

551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC 

601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT 

651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG 

701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA 

751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC 

801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA 

851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA 

901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA 

951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC 

1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT 

1051 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA 

1101 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA 

1151 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA 

1201 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT 

1251 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG 

1301 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA 

1351 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA 

1401 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG 

1451 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA 

1501 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA 

1551 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA 

1601 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT 

1651 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA 

1701 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG 

1751 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT 

1801 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA 



1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING 

51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN 

101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI 

151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV 

201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA 

251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT 

301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG 

351 GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ 

401 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ 

451 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT 

501 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD 

551 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL 

601 AAKQLEHHHH HH* 

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1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 



ATGAAACACT 
CTGTAGCGGC 
CCACTGTGGC 
TTCAAAGCTG 
CAAAAAAGAC 
TGGGTCTGAA 
AAACAAAACG 
GTTAACAACC 
CCGCTCTGGA 
ACGACATTTG 
ATTAGAAGCC 
ATATCGCCGA 
AAAACCGCCA 
CGATGCCAAA 
CCGCTGGCAC 
AAAGTTACCG 
TAAAAAAGCA 
AATTTGTCAG 
CGCTTGGCTT 
CGGTTTGGAT 
TTGCAGAACA 
GGATCCGGCG 
CGGTATCGGC 
CTTACGCCGG 
GCCGGTCGGG 
CCCCCCCCCG 
ACAAGAATTT 
CGCGGGGTAG 
ATCCTTTCCC 
ACAAAAACTA 
GGTAAAGACA 
AGCAAAGCCG 
TGGTCTCCCA 
GGTATTGCGC 
CAAGAACGAA 
GCGAACGTGG 
GCAGGCACTG 
CCAAGCGTTG 
GCCTGATGCA 
AAAAACATGC 
CAACACATAT 
TTATCACAGT 



TTCCATCCAA 
GCACTGGCAG 
CATTGCTGCT 
GAGAGACCAT 
GCAACTGCAG 
AAAAGTCGTG 
TCGATGCCAA 
AAGTTAGCAG 
TGCAACCACC 
CTGAAGAGAC 
GTGGCTGATA 
TTCATTGGAT 
ATGAAGCCAA 
GTAAAAGCTG 
AGCTAATACT 
ACATCAAAGC 
AACAGTGCCG 
AATTGATGGT 
CTGCTGAAAA 
AAAACAGTGT 
AGCCGCGCTC 
GAGGCGGCAC 
AGCAACAGCA 
TATCAAGAAC 
ATGACGTTGC 
AATCTGCATA 
GATCAACCTC 
AGGTAGGTAT 
GAACTGTATG 
TACGGCGTAT 
TTGAAGCTTC 
ACGGATATCC 
TATTATTGGC 
CCGATGCGAC 
ATGATGGTTG 
CGTGCGCATC 
CCGACCTTTT 
CTCGACTATT 
ACAGAGCGAT 
TTTTCATCTT 
GCCCTATTGC 
CGCAGGCGTA 



AGTACTGACC 
CCACAAACGA 
GCCTACAACA 
CTACGACATT 
CCGATGTTGA 
ACTAACCTGA 
AGTAAAAGCT 
ACACTGATGC 
AACGCCTTGA 
TAAGACAAAT 
CCGTCGACAA 
GAAACCAACA 
ACAGACGGCC 
CAGAAACTGC 
GCAGCCGACA 
TGATATCGCT 
ACGTGTACAC 
CTGAACGCTA 
ATCCATTGCC 
CAGACCTGCG 
TCCGGTCTGT 
TTCTGCGCCC 
GAGCAACAAC 
GAAATGTGCA 
GGTTACAGAC 
CCGGAGACTT 
AAACCTGCAA 
CGTCGACACA 
GCAGAAAAGA 
ATGCGGAAGG 
TTTCGACGAT 
GCCACGTAAA 
GGGCGTTCCG 
GCTACACATA 
CAGCCATCCG 
GTCAATAACA 
CCAAATAGCC 
CCGGCGGTGA 
TACGGCAACC 
TTCGACAGGC 
CATTTTATGA 
GACCGCAGTG 



ACAGCCATCC 
CGACGATGTT 
ATGGCCAAGA 
GATGAAGACG 
AGCCGACGAC 
CCAAAACCGT 
GCAGAATCTG 
CGCTTTAGCA 
ATAAATTGGG 
ATCGTAAAAA 
GCATGCCGAA 
CTAAGGCAGA 
GAAGAAACCA 
AGCAGGCAAA 
AGGCCGAAGC 
ACGAACAAAG 
CAGAGAAGAG 
CTACCGAAAA 
GATCACGATA 
CAAAGAAACC 
TCCAACCTTA 
GACTTCAATG 
AGCGAAATCA 
AAGACAGAAG 
AGGGATGCCA 
TCCAAACCCA 
TTGAAGCAGG 
GGCGAATCCG 
ACACGGCTAT 
AAGCGCCTGA 
GAGGCCGTTA 
AGAAATCGGA 
TGGACGGCAG 
ATGAATACGA 
CAATGCATGG 
GTTTTGGAAC 
AATTCGGAGG 
TAAAACAGAC 
TGTCCTACCA 
AATGACGCAC 
AAAAGACGCT 
GAGAAAAGTT 



TTGCCACTTT 
AAAAAAGCTG 
AATCAACGGT 
GCACAATTAC 
TTTAAAGGTC 
CAATGAAAAC 
AAATAGAAAA 
GATACTGATG 
AGAAAATATA 
TTGATGAAAA 
GCATTCAACG 
CGAAGCCGTC 
AACAAAACGT 
GCCGAAGCTG 
TGTCGCTGCA 
ATAATATTGC 
TCTGACAGCA 
ATTGGACACA 
CTCGCCTGAA 
CGCCAAGGCC 
CAACGTGGGT 
CAGGCGGTAC 
GCAGCAGTAT 
CATGCTCTGT 
AAATCAATGC 
AATGACGCAT 
CTATACAGGA 
TCGGCAGCAT 
AACGAAAATT 
AGACGGAGGC 
TAGAGACTGA 
CACATCGATT 
ACCTGCAGGC 
ATGATGAAAC 
GTCAAGCTGG 
AACATCGAGG 
AGCAGTACCG 
GAGGGTATCC 
CATCCGTAAT 
AAGCTCAGCC 
CAAAAAGGCA 
CAAACGGGAA 
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10 






2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 
2901 
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 
3951 
4001 
4051 
4101 
4151 
4201 

• 1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 



ATGTATGGAG 
CGGAATTACT 
GTTTCACCCG 
CCCATCGTAA 
GAGCAACGAC 
GTGCAGTCGG 
AAGGCCATGA 
TACGAAAGGT 
GCACGGGCGG 
AACAACACCT 
GTACGGCAAC 
TTTATAACGG 
TATCTGGCAG 
AGGCAGTCTG 
AACTGCTGAA 
TCGGCACGCG 
TCCCTTCCTG 
ACATCGAAAC 
ACAGCGGGCA 
TGCGGCACGG 
AACACGCCGT 
CTGGATGCCT 
AGCCGACCGC 
GCGCAGCGGC 
TTCAACAGTC 
CGATATGCAG 
ACGGCACGGG 
TGGGAACAGG 
CGGCATTGCC 
GCATGGGACG 
AGCATTAGTC 
TCTCAAAGGC 
GCACCGGTGC 
CAGCTGGGCG 
TTTGACGGTC 
TCGCCGAAAA 
GGCACGCTGG 
TAAAGCCGTC 
GCGACTACAC 
AAGACGGGGG 
CGCGGATGTC 
ACGCCGGTTC 
TACCGGTTCT 

MKHFPSKVLT 
FKAGETIYDI 
KQNVDAKVKA 
TTFAEETKTN 
KTANEAKQTA 
KVTDIKADIA 
RLASAEKSIA 
GSGGGGTSAP 
AGRDDVAVTD 
RGVEVGIVDT 
GKDIEASFDD 
GIAPDATLHI 
AGTADLFQIA 
KNMLFIFSTG 
MYGEPGTEPL 
PIVTGTAALL 
KAMNGPASFP 
NNTYTGKTII 
YLADTDQSGA 
SARGKGAGYL 
TAGSEGDTLS 
LDASESSATP 
FNSLAATVYA 



AACCGGGTAC 
GCCATGTGGT 
TACAAACCCG 
CCGGCACGGC 
AACCTGCGTA 
CGTGGACAGC 
ACGGACCCGC 
ACATCCGATA 
CCTGATCAAA 
ATACGGGCAA 
AACAAATCGG 
GGCGGCATCC 
ATACCGACCA 
CAGCTGGACG 
AGTGGACGGT 
GGAAGGGGGC 
AGTGCCGCCA 
CGACGGCGGC 
GTGAAGGCGA 
ACTGCTTCGG 
AGAACAGGGC 
CCGAATCATC 
ACAGATATGC 
AGCCGTACAG 
TCGCCGCTAC 
GGACGCCGCC 
TCTGCGCGTC 
GCGGTGTTGA 
GCGAAAACCG 
CAGCACATGG 
TGTTTGCAGG 
CTGTTCTCCT 
GGACGAACAT 
CACTGGGCGG 
GAAGGCGGTC 
AGGCAGTGCT 
TCGGACTCGC 
CTGTTTGCAA 
GGTAACGGGC 
CACGCAATAT 
GAATTCGGCA 
CAAACAGTAC 
GACTCGAG 

TAILATFCSG 
DEDGTITKKD 
AESEIEKLTT 
IVKIDEKLEA 
EETKQNVDAK 
TNKDNIAKKA 
DHDTRLNGLD 
DFNAGGTGIG 
RDAKINAPPP 
GESVGSISFP 
EAVIETEAKP 
MNTNDETKNE 
NSEEQYRQAL 
NDAQAQPNTY 
EYGSNHCGIT 
LQKYPWMSND 
FGDFTADTKG 
EGGSLVLYGN 
NETVHIKGSL 
NSTGRRVPFL 
YYVRRGNAAR 
ETVETAAADR 
DSTAAHADMQ 



AGAACCGCTT 
GCCTGTCGGC 
ATTCAAATTG 
GGCTCTGCTG 
CCACGTTGCT 
AAGTTCGGCT 
GTCCTTTCCG 
TTGCCTACTC 
AAAGGCGGCA 
AACCATTATC 
ATATGCGCGT 
GGCGGCAGCC 
ATCCGGCGCA 
GCAAAGGTAC 
ACGGCGATTA 
AGGCTATCTC 
AAATCGGGCA 
CTGCTGGCTT 
CACGCTGTCC 
CAGCGGCACA 
GGCAGCAATC 
CGCAACACCC 
CGGGCATCCG 
CATGCGAATG 
CGTCTATGCC 
TGAAAGCCGT 
ATCGCGCAAA 
AGGCAAAATG 
GCGAAAATAC 
AGCGAAAACA 
CATACGGCAC 
ACGGACGCTA 
GCGGAAGGCA 
TGTCAACGTT 
TGCGCTACGA 
TTGGGCTGGA 
GGGTCTGAAG 
CGGCGGGCGT 
GGCTTTACCG 
GCCGCACACC 
ACGGCTGGAA 
GGCAACCACA 



ALAATNDDDV 
ATAADVEADD 
KLADTDAALA 
VADTVDKHAE 
VKAAETAAGK 
NSADVYTREE 
KTVSDLRKET 
SNSFATTAKS 
NLHTGDFPNP 
ELYGRKEHGY 
TDIRHVKEIG 
MMVAAIRNAW 
IiDYSGGDKTD 
ALLPFYEKDA 
AMWCLSAPYE 
NLRTTLLTTA 
TSDIAYSFRN 
NKSDMRVETK 
QLDGKGTLYT 
SAAKIGQDYS 
TASAAAHSAP 
TDMPGIRPYG 
GRRLKAVSDG 



GAGTATGGCT 
ACCCTATGAA 
CCGGAACATC 
CTGCAGAAAT 
GACGACGGCT 
GGGGACTGCT 
TTCGGCGACT 
CTTCCGTAAC 
GCCAACTGCA 
GAAGGCGGTT 
CGAAACCAAA 
TGAACAGCGA 
AACGAAACCG 
GCTGTACACA 
TCGGCGGCAA 
AACAGTACCG 
GGATTATTCT 
CCCTCGACAG 
TATTATGTCC 
TTCCGCGCCC 
TGGAAAACCT 
GAGACGGTTG 
CCCCTACGGC 
CCGCCGACGG 
GACAGTACCG 
ATCGGACGGG 
CCCAACAGGA 
CGCGGCAGTA 
GACAGCAGCC 
GTGCAAATGC 
GATGCGGGCG 
CAAAAACAGC 
GCGTCAACGG 
CCGTTTGCCG 
CCTGCTCAAA 
GCGGCAACAG 
CTGTCGCAAC 
GGAACGCGAC 
GCGCGACTGC 
CGTCTGGTTG 
CGGCTTGGCA 
GCGGACGAGT 



KKAATVAIAA 
FKGLGLKKW 
DTDAALDATT 
AFNDIADSLD 
AEAAAGTANT 
SDSKFVRIDG 
RQGLAEQAAL 
AAVSYAGIKN 
NDAYKNLINL 
NENYKNYTAY 
HIDLVSHIIG 
VKLGERGVRI 
EGIRLMQQSD 
QKGIITVAGV 
ASVRFTRTNP 
QDIGAVGVDS 
DISGTGGLIK 
GALIYNGAAS 
RLGKLLKVDG 
FFTNIETDGG 
AGLKHAVEQG 
ATFRAAAAVQ 
LDHNGTGLRV 



CCAACCATTG 
GCAAGCGTCC 
CTTTTCCGCA 
ACCCGTGGAT 
CAGGACATCG 
GGATGCGGGT 
TTACCGCCGA 
GACATTTCAG 
ACTGCACGGC 
CGCTGGTGTT 
GGTGCGCTGA 
CGGCATTGTC 
TACACATCAA 
CGTTTGGGCA 
GCTGTACATG 
GACGACGTGT 
TTCTTCACAA 
CGTCGAAAAA 
GTCGCGGCAA 
GCCGGTCTGA 
GATGGTCGAA 
AAACTGCGGC 
GCAACTTTCC 
TGTACGCATC 
CCGCCCATGC 
TTGGACCACA 
CGGTGGAACG 
CCCAAACCGT 
GCCACACTGG 
AAAAACCGAC 
ATATCGGCTA 
ATCAGCCGCA 
CACGCTGATG 
CAACGGGAGA 
CAGGATGCAT 
CCTCACTGAA 
CCTTGAGCGA 
CTGAACGGAC 
AGCAACCGGC 
CCGGCCTGGG 
CGTTACAGCT 
CGGCGTAGGC 



AYNNGQEING 
TNLTKTVNEN 
NALNKLGENI 
ETNTKADEAV 
AADKAEAVAA 
LNATTEKLDT 
SGLFQPYNVG 
EMCKDRSMLC 
KPAIEAGYTG 
MRKEAPEDGG 
GRSVDGRPAG 
VNNSFGTTSR 
YGNLSYHIRN 
DRSGEKFKRE 
IQIAGTSFSA 
KFGWGLLDAG 
KGGSQLQLHG 
GGSLNSDGIV 
TAIIGGKLYM 
LLASLDSVEK 
GSNLENLMVE 
HANAADGVRI 
IAQTQQDGGT 
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1151 WEQGGVEGKM RGSTQTVGIA AKTGENTTAA ATLGMGRSTW SENSANAKTD 
1201 SISLFAGIRH DAGDIGYLKG LFSYGRYKNS ISRSTGADEH AEGSVNGTLM 
1251 QLGALGGVNV PFAATGDLTV EGGLRYDLLK QDAFAEKGSA LGWSGNSLTE 
1301 GTLVGLAGLK LSQPLSDKAV LFATAGVERD LNGRDYTVTG GFTGATAATG 
1351 KTGARNMPHT RLVAGLGADV EFGNGWNGLA RYSYAGSKQY GNHSGRVGVG 
1401 YRF* 

It will be understood that the invention has been described by way of example only and 
modifications may be made whilst remaining within the scope and spirit of the invention. For 
instance, the use of proteins from other strains is envisaged [e.g. see WO00/66741 for 
polymorphic sequences for ORF4, ORF40, ORF46, 225, 235, 287, 519, 726, 919 and 953]. 



EXPERIMENTAL DETAILS 



FPLC protein purification 

The following table summarises the FPLC protein purification that was used: 



Protein           pI               Column      Buffer             pH    Protocol

121.1 untagged    6.23             Mono Q      Tris               8.0   A
128.1 untagged    5.04             Mono Q      Bis-Tris propane   6.5   A
406.1L            7.75             Mono Q      Diethanolamine     9.0   B
576.1L            5.63             Mono Q      Tris               7.5   B
593 untagged      8.79             Mono S      Hepes              7.4   A
726 untagged      4.95             Hi-trap S   Bis-Tris           6.0   A
919 untagged      10.5 (-leader)   Mono S      Bicine             8.5   C
919LOrf4          10.4 (-leader)   Mono S      Tris               8.0   B
920L              6.92 (-leader)   Mono Q      Diethanolamine     8.5   A
953L              7.56 (-leader)   Mono S      MES                6.6   D
                  4.73             Mono Q      Bis-Tris propane   6.5   A
919-287           6.58             Hi-trap Q   Tris               8.0   A
953-287           4.92             Mono Q      Bis-Tris propane   6.2   A



Buffer solutions included 20-120 mM NaCl, 5.0 mg/ml CHAPS and 10% v/v glycerol. The
dialysate was centrifuged at 13000g for 20 min and applied to either a Mono Q or Mono S
FPLC ion-exchange resin. Buffer and ion-exchange resins were chosen according to the pI of
the protein of interest and the recommendations of the FPLC protocol manual [Pharmacia:
FPLC Ion Exchange and Chromatofocussing; Principles and Methods. Pharmacia
Publication]. Proteins were eluted using a step-wise NaCl gradient. Purification was
analysed by SDS-PAGE and protein concentration determined by the Bradford method.
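
The general rule of thumb behind choosing a resin from the pI can be sketched as follows. This is an illustrative sketch only: the function name `choose_exchanger` is hypothetical, and it encodes nothing more than the standard observation that a protein carries a net negative charge when the buffer pH is above its pI (so it binds an anion exchanger such as Mono Q) and a net positive charge below its pI (so it binds a cation exchanger such as Mono S). Actual choices were also guided by the Pharmacia manual, so individual table entries need not follow this simple rule.

```python
def choose_exchanger(pI: float, buffer_pH: float) -> str:
    """Suggest an ion-exchanger type from the protein pI and buffer pH.

    Hypothetical helper for illustration; not part of the described protocol.
    """
    if buffer_pH > pI:
        # Net negative protein charge: binds a positively charged resin.
        return "anion exchanger (e.g. Mono Q)"
    if buffer_pH < pI:
        # Net positive protein charge: binds a negatively charged resin.
        return "cation exchanger (e.g. Mono S)"
    raise ValueError("buffer pH equals pI: the protein has no net charge")


if __name__ == "__main__":
    # pI and pH values taken from the first row of the table (121.1 untagged).
    print(choose_exchanger(6.23, 8.0))
```

In practice the buffer pH is usually set at least about one unit away from the pI so that the protein binds firmly.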

The letter in the 'protocol' column refers to the following: 

FPLC-A: Clones 121.1, 128.1, 593, 726, 982, periplasmic protein 920L and hybrid proteins
919-287, 953-287 were purified from the soluble fraction of E.coli obtained after disruption
of the cells. Single colonies harbouring the plasmid of interest were grown overnight at 37°C
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh
medium and grown at either 30°C or 37°C until the OD550 reached 0.6-0.8. Expression of
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at
4°C. When necessary, cells were stored at -20°C. All subsequent procedures were performed
on ice or at 4°C. For cytosolic proteins (121.1, 128.1, 593, 726 and 982) and periplasmic
protein 920L, bacteria were resuspended in 25 ml of PBS containing complete protease
inhibitor (Boehringer-Mannheim). Cells were lysed by sonication using a Branson
Sonifier 450. Disrupted cells were centrifuged at 8000g for 30 min to sediment unbroken
cells and inclusion bodies and the supernatant taken to 35% v/v saturation by the addition of
3.9 M (NH4)2SO4. The precipitate was sedimented at 8000g for 30 minutes. The supernatant
was taken to 70% v/v saturation by the addition of 3.9 M (NH4)2SO4 and the precipitate
collected as above. Pellets containing the protein of interest were identified by SDS-PAGE
and dialysed against the appropriate ion-exchange buffer (see below) for 6 hours or
overnight. The periplasmic fraction from E.coli expressing 953L was prepared according to
the protocol of Evans et al. [Infect. Immun. (1974) 10:1010-1017] and dialysed against the
appropriate ion-exchange buffer. Buffer and ion-exchange resin were chosen according to
the pI of the protein of interest and the recommendations of the FPLC protocol manual
(Pharmacia). Buffer solutions included 20 mM NaCl and 10% (v/v) glycerol. The dialysate
was centrifuged at 13000g for 20 min and applied to either a Mono Q or Mono S FPLC
ion-exchange resin. Proteins were eluted from the ion-exchange resin using either step-wise
or continuous NaCl gradients. Purification was analysed by SDS-PAGE and protein
concentration determined by the Bradford method. Cleavage of the leader peptide of
periplasmic proteins was demonstrated by sequencing the NH2-terminus (see below).
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FPLC-B: These proteins were purified from the membrane fraction of Rcoli. Single 
colonies harbouring the plasmid of interest were grown overnight at 37°C in 20 ml of
LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh medium.
Clones 406.1L and 919LOrf4 were grown at 30°C and Orf25L and 576.1L at 37°C until the
OD550 reached 0.6-0.8. In the case of 919LOrf4, growth at 30°C was essential since
expression of recombinant protein at 37°C resulted in lysis of the cells. Expression of
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at
4°C. When necessary, cells were stored at -20°C. All subsequent procedures were performed
at 4°C. Bacteria were resuspended in 25 ml of PBS containing complete protease inhibitor
(Boehringer-Mannheim) and lysed by osmotic shock with 2-3 passages through a French
Press. Unbroken cells were removed by centrifugation at 5000g for 15 min and membranes
precipitated by centrifugation at 100000g (Beckman Ti50, 38000rpm) for 45 minutes. A
Dounce homogenizer was used to re-suspend the membrane pellet in 7.5 ml of 20 mM
Tris-HCl (pH 8.0), 1.0 M NaCl and complete protease inhibitor. The suspension was mixed for
2-4 hours, centrifuged at 100000g for 45 min and the pellet resuspended in 7.5 ml of 20 mM
Tris-HCl (pH 8.0), 1.0 M NaCl, 5.0 mg/ml CHAPS, 10% (v/v) glycerol and complete protease
inhibitor. The solution was mixed overnight, centrifuged at 100000g for 45 minutes and the
supernatant dialysed for 6 hours against an appropriately selected buffer. In the case of
Orf25L, the pellet obtained after CHAPS extraction was found to contain the recombinant
protein. This fraction, without further purification, was used to immunise mice.

FPLC-C: Identical to FPLC-A, but purification was from the soluble fraction obtained after 
permeabilising E.coli with polymyxin B, rather than after cell disruption. 

FPLC-D: A single colony harbouring the plasmid of interest was grown overnight at 37°C
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh
medium and grown at 30°C until the OD550 reached 0.6-0.8. Expression of recombinant
protein was induced with IPTG at a final concentration of 1.0 mM. After incubation for 3
hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at 4°C. When
necessary, cells were stored at -20°C. All subsequent procedures were performed on ice or at
4°C. Cells were resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v) glycerol,
complete protease inhibitor (Boehringer-Mannheim) and disrupted using a Branson Sonifier
450. The sonicate was centrifuged at 8000g for 30 min to sediment unbroken cells and
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inclusion bodies. The recombinant protein was precipitated from solution between 35% v/v 
and 70% v/v saturation by the addition of 3.9 M (NH4)2SO4. The precipitate was sedimented
at 8000g for 30 minutes, resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v)
glycerol and dialysed against this buffer for 6 hours or overnight. The dialysate was
centrifuged at 13000g for 20 min and applied to the FPLC resin. The protein was eluted from
the column using a step-wise NaCl gradient. Purification was analysed by SDS-PAGE and
protein concentration determined by the Bradford method.
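
The two-step ammonium sulphate cut (35% then 70% v/v saturation) used in FPLC-A and FPLC-D can be illustrated with a short volume calculation. This is a sketch under stated assumptions, not part of the protocol: it assumes the 3.9 M (NH4)2SO4 stock is treated as 100% saturated, that volumes are additive, and the function name `stock_volume_to_add` and the 25 ml starting volume are hypothetical.

```python
def stock_volume_to_add(sample_ml: float, current: float, target: float,
                        stock: float = 1.0) -> float:
    """Volume (ml) of ammonium sulphate stock needed to raise a sample
    from `current` to `target` fractional saturation.

    Assumes additive volumes and a stock at fractional saturation `stock`
    (1.0 = saturated; assumed here for the 3.9 M solution).
    """
    if not (0.0 <= current < target < stock):
        raise ValueError("require 0 <= current < target < stock")
    # Mass balance: sample*current + add*stock = (sample + add)*target
    return sample_ml * (target - current) / (stock - target)


if __name__ == "__main__":
    lysate_ml = 25.0  # hypothetical starting volume
    first = stock_volume_to_add(lysate_ml, 0.0, 0.35)
    # After the 35% pellet is removed, the supernatant volume is roughly
    # the sum of the lysate and the stock that was added.
    second = stock_volume_to_add(lysate_ml + first, 0.35, 0.70)
    print(f"to 35%: add {first:.1f} ml; 35% -> 70%: add {second:.1f} ml")
```

The second step needs considerably more stock than the first because the target approaches the saturation of the stock itself.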

Cloning strategy and oligonucleotide design 

Genes coding for antigens of interest were amplified by PCR, using oligonucleotides
designed on the basis of the genomic sequence of N. meningitidis B MC58. Genomic DNA
from strain 2996 was always used as a template in PCR reactions, unless otherwise specified,
and the amplified fragments were cloned in the expression vector pET21b+ (Novagen) to
express the protein as C-terminal His-tagged product, or in pET-24b+ (Novagen) to express
the protein in 'untagged' form (e.g. ΔG287K).

Where a protein was expressed without a fusion partner and with its own leader peptide (if
present), amplification of the open reading frame (ATG to STOP codons) was performed.

Where a protein was expressed in 'untagged' form, the leader peptide was omitted by
designing the 5'-end amplification primer downstream from the predicted leader sequence.

The melting temperature of the primers used in PCR depended on the number and type of
hybridising nucleotides in the whole primer, and was determined using the formulae:

Tm1 = 4 (G+C) + 2 (A+T)   (tail excluded)

Tm2 = 64.9 + 0.41 (% GC) - 600/N   (whole primer)

The melting temperatures of the selected oligonucleotides were usually 65-70°C for the
whole oligo and 50-60°C for the hybridising region alone.
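
The two melting-temperature formulae can be applied to a primer written in the 'TAIL-HYBRIDISING REGION' form used in the primer table. The following is a minimal sketch; the function names are hypothetical, and the example primer is the Orf46-2 forward primer from the table, with the part before the hyphen taken as the tail.

```python
def tm_hybridising(region: str) -> float:
    """Tm1 = 4(G+C) + 2(A+T), computed on the hybridising region only."""
    region = region.upper()
    gc = region.count("G") + region.count("C")
    at = region.count("A") + region.count("T")
    return 4 * gc + 2 * at


def tm_whole(primer: str) -> float:
    """Tm2 = 64.9 + 0.41(%GC) - 600/N, computed on the whole primer."""
    primer = primer.upper()
    n = len(primer)
    gc_percent = 100.0 * (primer.count("G") + primer.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


if __name__ == "__main__":
    # Tail (with restriction site) and hybridising region are separated
    # by a hyphen, as in the primer table.
    tail, region = "GGGAATTCCATATG-TCAGATTTGGCAAACGATTCTT".split("-")
    print(f"Tm1 (hybridising region) = {tm_hybridising(region):.1f} C")
    print(f"Tm2 (whole primer)       = {tm_whole(tail + region):.1f} C")
```

Both formulae are empirical approximations, so the values serve for comparing candidate primers rather than as exact annealing temperatures.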

Oligonucleotides were synthesised using a Perkin Elmer 394 DNA/RNA Synthesizer, eluted
from the columns in 2.0 ml NH4OH, and deprotected by 5 hours incubation at 56°C. The
oligos were precipitated by addition of 0.3 M Na-Acetate and 2 volumes ethanol. The
samples were centrifuged and the pellets resuspended in water.
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Sequences 


Restriction 
site 


OrflL 


Fwd 


CGCGGATCCGCTAGC-AAAACAACCGACAAACGG


Nhel 


Rev 


CCCGCTCGAG-TTACCAGCGGTAGCCTA


Xhol 


Orfl 


Fwd 


CTAGCTAGC-GGACACACTTATTTCGGCATC 


Nhel 


Rev 


CCCGCTCGAG- TTACCAGCGGTAGCCTAATTTG 


Xhol 


OrflLOmpA 


Fwd 




NdeI-(NheI) 


Rev 


CCCGCTCGAG- 


Xhol 


Orf4L 


Fwd 


CGCGGATCCCAIAlQ-AAAACCrTCTrCAAAACC 


Ndel 


Rev 


CCCGeESSAS-TTATTrGGCTGCGCCTTC 


Xhol 


OrfML 


Fwd 
Rev 


GCGGCAJIMI-ATGTTGAGAAAATrGTTGAAATGG 
GCGGCCTCGAG-TTATTTTTTCAAAATATATTTflr. 


Asel 1 
Xhol 


Orf9-lL 


Fwd 
Rev 


GCGGCCATATG-TTACCTAACCGTTTCAAAATGT 

GCGGCCTCGAG-TTATTTCCGAGGTT1TCGGG 

CGCGGATCCCATATG-ACACGCTTCAAATATTC 


Ndel 
Xhol 


Orf23L 


Fwd 


Ndel 


Rev 


CCCGCTCQAfi-TTATTTAAACCGATAGGTAAA 
CGCGGATCCCATATG-GGCAGGGAAGAACCGC 


Xhol 

NMeT 


Orf25-l His 


Fwd 


Rev 


GCCCAAGCTT-ATCGATGGAATAGCCGCG 


HindIII


Or£29«l b-His 
(MC58) 


Fwd 


CGCGGATCCGCTAGC-AACGGTTTGGATGCCCG 


NheT 


Rev 


CCCGCTCGAG-TTTGTCTAAGTTCCTGATAT 
CCCGCTCGAG-ATTCCCACCTGCCATC 




Orf29-l b-L 
(MC58) 


Fwd 


CGCGGATCCGCIAS£-ATGAATTrGCCTATTCAAAAAT 


Nhel 


Rev 


CCCGCTCGAG-TTAATTCCCACCTGCCATC 


Xhol 


Orf29-l c-His 
(MC58) 


Fwd 


CGCGGATCCGCEAQC-ATGAATTTGCCTATTCAAAAAT 


Nhel 


Rev 


CCCG£JCG^Q-TTGGACGATGCCCGCGA 


Xhol 


Or£29-l c-L 
(MC58) 
Or£25L 

Orf37L 


Fwd 


CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT 


Nhel 


Rev 


CCCGCTCGAG-TTATTGGACGATGCCCGC 


Xhol 


Fwd 


CGCGGATCCCATATG-TATCGCAAACTGATTGC 


Ndel 


Rev


ccrTiCTcnAn cr a a ttp, a tcxc, a a t a ncc 


XhoI


Fwd 


CGCGGATCCCAJATG-AAACAGACAGTCAAATG 


Ndel 


Rev 


CCCGCTCGAG-TCAATAACCCGCCTTCAG 


Xhol 


Orf38L 


Fwd 


CGCGGATCCCAEATG- 
TTACGTTTGACTGCTTTAGCCGTATGCACC 


Ndel 


Rev


fYwvTYYi a n 
V-LCUC1CUAU- 

TTATTTTGCCGCGTTAAAAGCGTCGGCAAC 


XhoI


Orf40L 


Fwd 


CGCGGATCCCATATG-AACAAAATATACCGCAT 


Ndel 


Rev 


CCCGCTCGAG-TTACCACTGATAACCGAC 


Xhol 


Orf40.2-His 


Fwd 


CGCGGATCCCATATG-ACCGATGACGACGATTTAT


Ndel 


Rev 


GCCCAAGCTT-CCACTGATAACCGACAGA 


HindIII


Orf40.2L 


Fwd 


CGCGGATCCCATATG-AACAAAATATACCGCAT 


Ndel 


Rev 


GCCCAAGCTT-TTACCACTGATAACCGAC


HindIII


Orf46-2L 


Fwd 


GGGAATTCCATATG-GGCATTTCCCGCAAAATATC


Ndel 


Rev 


CCCGCTCGAG-TTATTTACTCCTATAACGAGGTCTCTTAAC 


Xhol 


Orf46-2 


Fwd 


GGGAATTCCATATG-TCAGATTTGGCAAACGATTCTT 


Ndel 


Rev 


CCCGCTCGAG-TTATTTACTCCTATAACGAGGTCTCTTAAC


Xhol 


Orf46.1L 


Fwd 


GGGAATTCCATATG-GGCATTTCCCGCAAAATATC 


Ndel 








Rev 


CCCGCTCGAG-TTACGTATCATATTTCACGTGC


Xhol 


onto* ^tiis-vxa 1 ) 


Fwd


UUGAAl I i-CA I A I GCACGTG A A AT ATO AT ACG AAG 


BamHI-Ndel 




Rev


cxcCjCicuaui 1 1 ACTCCTATAACGAGijlUlUi TAAC 


Xhol 


onw.l-tiis 


Fwd 


GGGAATTCCATATGTCAGATTTGGCAAACGATTCTT 


Ndel 




Rev 


CCCGCTCG AGCGTATC AT ATTTC A CfiTGC" 


Xhol 


orf46.2-His 


Fwd 


GGGAATTCCAT^T£TCAGATTTGGCAAACGATTCTT 


Ndel 




Rev 


CCCGCTCGAGTTTACTCCTATAACGAGGTCTCTTAAC 


Xhol 


Orf65-l-(His/GST) 


Fwd 


CGCGG ATCCC ATATG-CA A A ATfTOTTrr A A A A TCCC 


BamHI-Ndel 


(MCS8) 


Rev 


CGCGGATCCCATAIQ-AACAAAATATACXXjCAT 
CCCGCTCGAG -TTTGCTTTCGATAGAACGG 


Xhol 


Orf72-lL 


Fwd 


gcggcc^jaiq-gtcataaaatatacaaatttgaa 


Ndel 




Rev 


gcggcctcgaq-ttagcctgagacctttgcaaatt 


Xhol 


Orf76-lL 


Fwd 


gcggccatatg-aaacagaaaaaaaccgctg 


Ndel 




Rev 


gcggcct^g^-ttacggtttgacaccgttttc 


Xhol 


Or£83.1L 


Fwd 


CGCGGATCCCATATG-AAAACrrTnTTrrTr 


Ndel 




Rev 


CCCGCT^GAQ-TTATCCTCCTTTGCGGC 


Xhol 


Orf85-2L 


Fwd 


GCGGCCATATG-GCAAAAATGATGAAATGGG 


Ndel 




Rev 


GCGGCCTCGAG-TTATCGGCGrr,GTGGGrr 


Xhol 


Or£91L (MC58) 


Fwd 


GCGGCCATATGAAAAAATCCTCCCTCATCA 


Ndel 




Rev 


GCGGCCTCGAGTTATTTGCCGCCGTTTTTGGC 


Xhol 


Orf91-His(MC58) 


Fwd 


GCGGCCATATGGCCCCTGCCGACGCGGTAAG 


Ndel 




Rev 


GCGGCCTXIGAGTTTGCCGCCGTTTITGGCTTTC 


Xhol - 


Orf97-lL 


Fwd 


GCGGCCATATG-AAACACATACTCCCCCTGA 


Ndel - 




Rev 


GCGGCCTCGAG-TTATTCGCCTACGGTTTTTTG 


Xhol 4 


Orfll9L (MC58) 


Fwd 


GCGGCCATATGATTTACATCGTACTGTTTC 


NdeI




Rev 


GCGGCCTCGAGTTAGGAGAACAGGCGCAATGC 


XhoI


Orfll9-His(MC58) 


Fwd 


GCGGCCATATGTACAACATGTATCAOfi A A A AP 


NdeI




Rev 


GCGGCCTCGAGGGAGAACAGGCGCAATGCGG 


XhoI


Orfl37.1 (His- 
GST) (MC58) ! 


Fwd 


CGCGGATCCGCTAGCTGCGGCArGnrnnn 


BamHI-NheI




Rev


CCCGCTCG AG ATAACGGTATOCCGCr A C 


Xhol 


Orfl43-lL 


Fwd 


cck:ggatcccaiaiq-gaatcaacactttcac 


Ndel 




Rev 


CCCGCTCGAG-TTACACGCGnTTGrTGT 


Xhol 


008 


Fwd 


CGCGGATCCCATATG-AACAACAGACATTTTG


Ndel 




Rev 


CCCGCTCGAG-TTACCTGTCCGGTAAAAG


Xhol 


050-1(48) 


Fwd 


CGCGGATCC2CTAQC-ACCGTCATCAAACAGGAA 


Nhel 




Rev 


CCCGCTC£A£-TCAAGATTCGACGGGGA 


Xhol 


105 


Fwd 


CGCGGATCCCAIA1Q-TCCGCAAACGAATACG 


Ndel 




Rev 


CCCGCTCGAG-TCAGTGTTCTGCCAGTTT 


Xhol 


111L 


Fwd 


CGCGGATCCCATA1Q-CCGTCTGAAACACG 


Ndel 




Rev 


CCCGCTCGAG-TTAGCGGAGCAGTTTTTC 


Xhol 


117-1 


Fwd 


CGCGGATCCCATATG-ACCGCCATCAGCC 


Ndel 




Rev 


CCCGCTCGAG-TTAAAGCCGGGTAACGC 


Xhol 


121-1 


Fwd 


GCGGCCj^TG-GAAACACAGCTTTACATCGG 


Ndel 




Rev 


GCGGCCTCGAG-TCAATAATAATATCCCGCG 


Xhol 
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122-1 


Fwd 
Rev 


GCGGCCATATC-ATTAAAATCCGCAATATCC 


Ndel 


GCGGCCTCGAG-TTAAATCTTGGTAGATTGGATTTGG 


Xhol 


128-1 


Fwd 
Rev 


GCGGCCATATC-ACTGACAACGCACTGCTCC 


Ndel 


GCGGCdCeAQ-TC AG ACCGCGTTGTCG AAAC 


Xhol 


148 

149.1L (MC58) 


Fwd 


CGCGGATCCCAIATG-GCGTTAAAAACATCAAA 


Ndel 


Rev 
Fwd 
Rev 


CCCG£rCQAQ-TCAGCCCTTCATACAGC 

GCGGCATTAATGGCACAAACTACACTCAAACC 

GCGGCCTCGAGTTAAAACTTCACGTTCACGCCG 


Xhol 
Asel 
Xhol 


149.1-His(MC58) 


Fwd 
Rev 


GCGGCATTAATGCATGAAACTGAGCAATCGGTGG 
CKZGGCCTCGAGAAACTTCACGTTCACGCCGCCGGTAAA 


Xhol 


205 (His-GST) 
(MC58) 


Fwd 


CGCGGATCCCATATGGGCAAATCCGAAAATACG 


BamHI-Ndel 



Rev 


CCCGCTCGAGATAATGGCGGCGGCGG 


Xhol 


206L 


Fwd 


CGCGGATCCCATATG-TTTCCCCCCGACAA 


Ndel 


Rev 


CCCGCTCGAG-TCATTCTGTAAAAAAAGTATG 


Xhol 


214 (His-GST) 
(MC58) 


Fwd 


CGCGGATCCCATATGCTTCAAAGCGACAGCAG 


BamHI-Ndel 


Rev 


CCCGCTCGAGTTCGGATTTTTGCGTACTC 


Xhol 


216 


Fwd 


CGCGGATCCCATATG-GCAATGGCAGAAAACG 


Ndel 


Rev 


CCCGCTCGAG-CTATACAATCCGTGCCG 


Xhol 


225-1L 


Fwd 


CGCGGATCCCATATG-GATTC1TTTTTCAAACC 


Ndel 


Rev 


CCCGCTCGAG-TCAGTTCAGAAAGCGGG 


Xhol 


235L 


Fwd 


CGCGGATCCCATATG-AAACCTTTGATTTTAGG 


Ndel 


Rev 


CCCGCTCG AG-TTATTTG GGCTGCTCTTC 


Xhol 


243 


Fwd 


CGCGGATCCCATATG-GTAATCGTCTGGTTG 


Ndel 


Rev 


CCCGCTCGAG-CTACGACTTGGTTACCG 


Xhol 


247-1L 


Fwd 


GCGGCCATATG-AGACGTAAAATGCTAAAGCTAC 


Ndel 


Rev 


GCGGCCTCGAG-TCAAAGTGTTCTGTTTGCGC 


Xhol 




Fwd 


GCCGCCATATG-TTGACTTTAACCCGAAAAA 


Ndel 


Rev 


GCCGCGTCGAG-GCCGGCGGTCAATACCGCCCGAA 


Xhol 


270 (His-GST) 
(MC58) 


Fwd 


CGCGXjATCCCATATGGCGCAATGCGATTTGAC 


BamHI-Ndel 


Rev 


CCCGerCGAGTTCGGCGGTAAATGCCG 


Xhol 


274L 


Fwd 


GCGGCCATATG-GCGGGGCCGATTTTTGT 


Ndel 


Rev 


GCGG CCTCG AG-TTATTTGCTTTC AGTATT ATTG 


Xhol 


283L 


Fwd 


GCGGCCATATG-AACTTTGCTTTATCCGTCA 


Ndel 


Rev 


GCGGCCTCGAG-TTAACGGCAGTATTTGTTTAC 


Xhol 


285-His 


Fwd 


CGCGGATCCCATATGGGTTTGCGCTTCGGGC 


BamHI


Rev 


GCCCAAGCTTTTTTCCTTTGCCGTTTCCG 


HindIII


286-His 
(MC58) 


Fwd 


CGCGGATCCCAIMQ-GCCGACCTTTCCGAAAA 


Ndel 


Rev 


CCCGCTCGAG-GAAGCGCGTTCCCAAGC 


Xhol 


286L 
(MC58) 


Fwd 
Rev 


CGCGGATCCCAIATG-CACGACACCCGTAC 
CCCGCTCGAG-TTAGAAGCGCGTTCCCAA 


Ndel 
Xhol 


287L 


Fwd 


CTAGCTAGC-TTTAAACGCAGCGTAATCGCAATGG 


Nhel 


Rev 


CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC 


Xhol 
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287 


Fwd 


CTAGCTAGC-GGGGGCGGCGnrnnCG 


NheI


Rev 


CCCGCTCG AG-TC AATCCTGCTC I'l ' n ' 1 ' 1 T\CC 


XhoI


287LOrf4 


Fwd 


CTAGCTAGCGCTCATCCrCGCCncC- 
TGCGGGGGCGGCGGT 


NheI


Rev 


cccGCTCGAG-TCAATCCTncrr i t n ■ n arc 


Xhol 


287-fta 


Fwd 


CGGGGATCC-GGGGGCGGCGGTnnrn 


BamHI 


Rev 


CCCGgSGAG-TCAATCCTGCTCrrrrriGCC 


Xhol 


287-His 


Fwd 


CTAfiCTAGC-GGGGGCGGCGGTGGCG 


Nhel 


Rev 


CCCGCTCGAG-ATCCTGCTCTTTTTTGCC * 


Xhol 


287-His(2996) 


Fwd 


CTAGCTAGC-TGCGGGGGCGGCGGTGGCG 


Nhel 


Rev 


CCCGCTCGAG- ATCCTGCTCl'l" J ' r 1 ' 1 GCC 


Xhol 


Al 287-His 


Fwd 


CGCGGATCCGCIAQe-CCCGATGTTAAATCGGC * 


Nhel 


A2 287-His 


Fwd 


CGCGGATCCGCTAGC-CAAGATATGGCGGCAGT 5 


Nhel 


A3 287-His 


Fwd 


CGCGGATCCG£IAGC-GCCGAATCCGCAAATCA * 


Nhel 


A4 287-His 


Fwd 


CGCGCT^GC-GGAAGGGTTGATTTGGCTAATGG 8 


Nhel 


A4 287MC58-His 


Fwd 


CGCGCTAGC-GGAAGGGTTGATTTGGCTAATfiO * 


Nhel 


287a-His 


Fwd 


CGCCATATG-TTTAAACGCAGCGTAATCGC 


Ndel 


Rev 


CCCGCXCGAG-AAAATTGCTACCGCCATTCGCAGG 


Xhol 


287b-His 


Fwd 


CGCCATATG-GGAAGGGTTGATTTGGCTAATGG 


Ndel | 


287b-2996-His 


Rev 


CCCGCim^-CTTGTCTTTATAAATGATGACATArTO 


Xhol 


287b-MCS8-His 


Rev 


CCCGCTOGAS-TITATAAAAGATAATATATTGATTGATTCC 


Xhol 


287c-2996-His 


Fwd 


CGCGCTAGC- ATGCCGCTG ATTCCCfiTf! A ATC 8 


Nhel •■• 


t287m *««i, (2996) 


Fwd 


CTAGCTAGC-GGGGGCGGCGGTGGCG 


Nhel 


Rev 


CCCGCTTGAG-TCAATCCTGCTCrrrri'IGCC 


Xhol ..... 


AG287-His * 


Fwd 


CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC 


Nhel - 


Rev 


CCCGCTCGAG-ATCCTGCTCTTTTTTGCC 


Xhol | 


AG287K(2996) 


Fwd 


CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC 


Nhel 


Rev 


CCCGCTCG AG-TC AATCCTGCTC J " I " 1 " i ' 1 " ! GCC 


Xhol .... 


AG287-L 


Fwd 


CGCGGATCCQQTAGC- 

TTTGAACGCAGTGTGATTGCAATGGCTTGTA'l'ri'l'rGCC 
CTTTCAGCCTGT TCGCCCGATGTTAAATCGGCG 


Nhel .„ 


Rev 


CCCGCTCGAG-TCAATCCTGCTCTITTTTGCC 


Xhol 


AG287-Orf4L 


Fwd 


CGCGGATCCGCTAGC- 

AAA ACCTTCTTC A A AACCCTTTCCGCCGCCGC A C^VCGCG 
CTCATCCTCGCCGCCTGC TCGCCCGATGTTAAATCG 


Nhel 


Rev 


CCCGCTCG AG-TCAATCCTGCTC rTTTTl 'GCC. 


Xhol 


292L 


Fwd 


CGCGG ATCCC ATATG- AAAACC A AGTTAATC AA A 


Ndel 


Rev 


CCCGQG^AG-TTATrGATTTTTGCGGATGA 


Xhol 


308-1 


Fwd 


CGCGGATCCCAEATG-TTAAATCGGGTATnTATC 


Ndel 


Rev 


CCCGCTCGAG-TTAATCCGCCATTCCCTG 


Xhol 


401L 


Fwd 


GCGGCCATA1S-AAATTACAACAATTGGCTG 


Ndel 


Rev 


GCGGCCTGGAS-TTACCTrACGTTTTrCAAAG 


Xhol 


406L 


Fwd 


CGCGGATCCCATATG-CAAGCACGGCTGCT 


Ndel 


Rev 


CCCGCTCGAG-TCAAGGTTGTCCTTGTCTA 


Xhol 


502-1L 


Fwd 


CGCGGATCCCATATG-ATGAAACCGCACAAC 


Ndel 


Rev 


CCCG£ECeAS-TCAGTTGCTCAACACGTC 


Xhol 
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502-A(His-GST) 


Fwd 


CGCGGATCCCATATGGTAGACGCGCTTAAGCA 


BamHI-Ndel 


Rev 


CCCGC1CQAQAGCTGCATGGCGGCG 


Xhol 


503-1L 


Fwd 


CGCGGATCCCATATG-GCACGGTCGTTATAC 


Ndel 


Rev 


CCCGCTCQAG-CTACCGCGCATTCCTG 


Xhol 


519-1L 


Fwd 


GCGGCCMMS-GAATTTTTCATTATCTTGTT 


Ndel 


Rev 


GCGGCCTCGAG-TTATTTGGCGGTTTTGCTGC 


Xhol 


S25-1L 


Fwd 


GCGGCCATATG-AAGTATGTCCGGTTATTTTTC 


Ndel 


Rev 


GCGGCCTCGAG-TTATCGGCTTGTGCAACGG 


Xhol 


529-(His/GST) 

(MC38> 


Fwd 


CGCGGATCCGCTAGC-TCCGGCAGCAAAACCGA 


Bam HI-Nhel 


Rev 


GCCCMGdl-ACGCAGTTCGGAATGGAG 


HindIII


552L 


Fwd 


GCCGCCATATGTTGAATATTAAACTGAAAACCTTG 


Ndel 




Rev 


GCCGCCTCGAGTTATTTCTGATGCCTTTTCCC 


Xhol 


556L 


Fwd 


GCCGCCATATGGACAATAAGACCAAACTG 


Ndel 






GCCGCCTCG AGTT A ACGGTGCGG ACGTTTC 


XhoI


557L 


Fwd 


CGCGGATCCCMMQ-AACAAACTGTTTCTTAC 


Ndel 




Rev 


CCCGCTCGAG-TCATTCCGCCTTCAGAAA 


Xhol 


564ab-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG- 
CAAGGTATCGTTGCCGACAAATCCGCACCT 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT 


Xhol 


564abL(MC58) 


Fwd 


CGCGGATCCCATATG- 

AACCGCACCCTGTACAAAGTTGTATTTAACAAACATC 


Ndel 


Rev 


CCCGdCQAG- 

TTAAGCTAATTGTGCTTGGTTTGCAGATAGGAGTT 


Xhol 


564b- 
(His/GST)(MC58) 


Fwd 


CGCGGATCCCATATG- 

ACGGGAGAAAATCATGCGGTTTCACTTCATG 


BamHI-Ndel 


Rev 


CCCGdQGAQ- 

AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT 


Xhol 


564c- 
(His/GST)(MC58) 


Fwd 


CGCGGATCCCATATG- 

GTTTCAGACGGCCTATACAACCAACATGGTGAAATT 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

GCGGTAACTGCCGCTTGCACTGAATCCGTAA 


Xhol 


564bc- 
(His/GST)(MCS8) 


Fwd 


CGCGGATCCCATATG- 

ACGGGAGAAAATCATGCGGTTTCACTTCATG 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

GCGGTAACTGCCGCITGCACTGAATCCGTAA 


Xhol 


564d- 

(His/GST)(MC58) 


Fwd 


CGCGGATCCCATATG- 

CAAAGCAAAGTCAAAGCAGACCATGCCTCCGTAA 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

TGllUltJCl^l'CAATTATAACTTTAGTAGGTTCAATTTTG 
GTCCCC 


Xhol 


564cd- 
(His/GST)(MC58) 


Fwd 


CGCGGATCCCATATG- 

GTTrCAGACGGCCTATACAACCAACATGGTGAAATT 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

T( TTTTPTTTrPA ATT ATA At "riTAflTAGGTTCAATTTTG 
GTCCCC 


Xhol 


570L 


Fwd 


GCGGCC^IAIQ-ACCCGTTTGACCCGCG 


Ndel 


Rev 


GCGGCCTG_GA£-TCAGCGGGCGTTCATTTCTT 


Xhol 


576-1L 


Fwd 


CGCGGATCCCATATG-AACACCATTTTCAAAATC 


Ndel 


Rev 


CCCGCTCGAG-TTAATTTACITrnTGATGTCG 


Xhol 
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580L 


Fwd 


GCGGCCAIAIG-GATTCGCCCAAGGTCGG 


Ndel 


Rev 


GCGGCCTCGAG-CTACACTTCCCCCGAAGTGG 


Xhol 


583L 


Fwd 


CGCGGATCCCA1ATG-ATAGTTGACCAAAGCC 


Ndel 


Rev 


CCCGCTCGAG-TTATTIT^ 


Xhol 


593 


Fwd 


GCGGCC AT ATG-CTTG AACTO A ACOa ACT 


Ndel 


Rev 


GCGGCCTCGAG-TCAGCGGAAGCGGACGATT 


Xhol 


650 (His-GST) 
(MC58) 


Fwd 


CGCGGATCCC ATATGTCCA A ACTT A AAA CCATCn 


BamHI-NdeT 


Rev 


CCCGCTCGAGGCTTCCAATCAGTTTGArr 


Xhol 


652 


Fwd 


GCGGCCAIATC-AGCGCAATCXjITGATATITrc 


Ndel 


Rev 


GCGGCCTCGAS-TTATTTGCCCAGTTGGTAGAATG 


Xhol 


664L 


Fwd 


GCGGCCA1A1G-GTGATACATCCGGACTACTTC 


Ndel 


Rev 


GCGGC£T£GAG-TCAAAATCGAGTTTTACACCA 


Xhol 


726 


Fwd 


GCGGCCAIAie-ACCATCTATITCAAAAACGG 


Ndel 


Rev 


GCGGCCTCGAG-TCAGCCGATGTTTAGrnTrrATT 


Xhol 


741-His(MC58) 


Fwd 


CGCGGATCCCAIAIQ-AGCAGCGGAGGGGGTG 


Ndel 


Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC 


Xhol 


AG741-His(MC58) 


Fwd 


CGCGGATCCCATATG-GTCGCCGCCGACATrn 


Ndel 


Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC 


Xhol 


686-2-(Hfc/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-GGCGGTTCGGAAGGCG 


BamHI-Ndel 


Rev 


CCCGCTCGAG-TTGAACACTGATGTCTTTTCCGA 


Xhol 


719-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCGCTAGC- AAACTGTrGTTGfiTnTT A AC 


BamHI-Nhel 


Rev 


CCCGCTCQAG-TTGACCCGCTCCACGG 


Xhol 


730-His (MC58) 


Fwd 


GCCGCCATATGGCGGACTTGGCGCAAGACCC 


Ndel 




Rev 


GCCGCCTCGAGATCTCCTAAACCTGTTTTAACAATGCCG 


Xhol 


730A-ffis (MC58) 


Fwd 


GCCGCCATATGGCGGACTTGGCGCAAGACCC 


Ndel 




Rev 


GCGGCCTCGAGCTCCATGCTGTTGCCCCAGC 


Xhol - 


730B-His (MC58) 


Fwd 


GCCGCCATATGGCGGACTTGGCGCAAGACCC 


Ndel 




Rev 


GCGGCCTCGAGAAAATCCCCGGTAACCGCAG 


Xhol 


741-His 
(MC58) 


Fwd 


CGCGGATCCCATATG-AGCAGCGGAGGGGGTG


Ndel ._. 


Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC


Xhol 


AG741-His 
(MC58) 


Fwd 


CGCGGATCCCATATG-GTCGCCGCCGACATCG


Ndel 


Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC 


Xhol 


743 (His-GST) 


Fwd 


CGCGGATCCCATATGGACGGTGTTGTGCCTGTT 


BamHI-Ndel 



Rev 


CCCGCTCGAGCTTACGGATCAAATTGACG 


Xhol 


757 (His-GST) 
(MC58) 


Fwd 


CGCGG ATCCC ATATGGGC AGCC AATCTG A AG A A 


BamHI-Ndel 


Rev 


CCCGCTCGAGCTCAGCTTTTGCCGTCAA 


Xhol 


759-His/GST 
(MC58) 


Fwd 


CGCGGATCCGCTAGC-TACTCATCCATTGTCCGC 


BamHI-Nhel 


Rev 


CCCGCTCGAG-CCAGTTGTAGCCTATTTTG 


Xhol 


759L 
(MC58) 


Fwd 


CGCGGATCCGCTAGC-ATGCGCTTCACACACAC 


Nhel 


Rev 


CCCGCIGGAG-TTACCAGTTGTAGCCTATTT 


Xhol 


760-His 


Fwd 
Rev 
Fwd 


GCCGCCATATGGCACAAACGGAAGGTTTGGAA 

GCCGCCTCGAGAAAACTGTAACGCAGGTTTGCCGTC 

GCGGCCATATGGAAGAAACACCGCGCGAACCG 


Ndel 
Xhol 
Ndel 




769-His (MC58) 
Rev


GCGGCCTCGAGGAACGTTTTATTAAACTCGAC 


Xhol 


907L 


Fwd 


GCGGCCAXA1Q-AGAAAACCGACCGATACCCTA 


Ndel 


Rev 


GCGGCCICQAQ-TCAACGCCACTGCCAGCGGTTG 


Xhol 


911L 


Fwd 


CGCGGATCC£A1A1£.-AAGAAGAACATATTGGAATTTTGGGTCGGACTG 


Ndel 


Rev 


CCCGCTCGAG'TTATTCGGCGGCl'l'L 'I "1CCGCATTGCCG 


Xhol 


911LOmpA 


Fwd 


GGGAATTCCATATGAAAAAGACAGCTATCGCGATTGCA 
nTrsfy 1 A rrryyTnn' r i * i 'pry^r a rrr.T a ru^nr* a finer* nr> 

TAGjQ-GCTTTCCGCGTGGCCGGCGGTGC 


NdeI-(NheI) 


Rev 


cccGcjGG^G-rrAritjGucGGcrrrri'ccGCA'riGccG 


Xhol 


911LPelB 


Fwd 


CATGC£AlQfi-CTTTCCGCGTGGCCGGCGGTGC 


Ncol 


Rev 


CCCGCiniM-TTATTCGGCGGCTTTrTCCGCATTGCCG 


Xhol 


913-His/GST 
(MC58) 


Fwd 
Rev 


CGCGGATCCCATATG-TTTGCCGAAACCCGCC 
CCCGCTCGAG-AGGTTGTGTTCCAGGTTG 


BamHI-NdeI
Xhol 


913L 
(MC58) 


Fwd 


CGCGGATCCC^TATG-AAAAAAACCGCCTATG 


Ndel 


Rev 


CCCGCTCGAG-TTAAGGTTGTGTTCCAGG 


Xhol 


919L 


Fwd 


CGCGGATCCCATATG-AAAAAATACCTATTCCGC 


Ndel 


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGG


Xhol 


919 


Fwd 


CGCGGATCCCATATG-CAAAGCAAGAGCATCCAAA 


Ndel 


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGG 


Xhol 


919LOrf4 


Fwd 


GGGAATTCCAXAIGAAAACCTTCTTCAAAACCCTTTCCG 

CCOLCuCUCTAGCGCTCATCCICGCCGCC- 

TGCCAAAGCAAGAGCATC 


NdeHNhel) 


Rev 


CCCGCTCGAG-TTACGGGCGGTATTCGGGCTTCATACCG 


Xhol 


(919)-287fusion 


Fwd 


CGCGGATCCGTCGAC-TGTGGGGGCGGCGGTGGC 


Sail 


Rev 


CCCGCTCGAG-TCAATCCTGCTCi'ri"l"llGCC 


Xhol 


n^A it 
920- 1L 


Fwd


/""l /^/*"l/"5*"V"l ATA TV* 1 A A/^A AA A /~* A "TTT A /"< A / * 

CjCOCjCC A 1 ATG- AAG AAAAC A 1 1 G ACACTGC 


Ndel 


Rev 


GCGGCCTCGAG-TTAATGGTGCGAATGACCGAT 


Xhol 


yZi-rlis/ljra i 
(MC58) GATO 


Fwd
Rev 


ggggacaagtttgtacaaaaaagcaggctTGCGGCAAGGATGCCGG 
ggggaccactttgtacaagaaagctgggtCTAAAGCAACAATGCCGG 


anal 
attB2 




926L 


Fwd 


CGCGGATCCCATATG-AAACACACCGTATCC 


Ndel 


Rev


LLLut 1 LuAVj- 1 1 A 1 U 1 ULi 1 GCLit-LiLA- 


XhoI


927-2-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-AGCCCCGCGCCGATT 


BamHI-Ndel 


Rev 


CCCGCJCGAG.TTTTTGTGCGGTCAGGCG 


Xhol 


932-His/GST 
(MC58) CATC 


Fwd 


ggggacaagmgtacaaaaaagcaggctTGTTCGTTTGGGGGATTTAA 
ACCAAACCAAATC 


attBl 


93S (His-GST) 
(MC58) 


For 


CGCGGATCCCATATGGCGGATGCGCCCGCG 


BamHI-Ndel 


Rev


cnnnrTnn a n a a a eerier* a a Tr^nnrr* 


XhoI


936-1L 


Rev 


ggggaccactttgtacaagaaagctgggtlCAi 1 1 iui 1 1 1 iLCi rc I rcr 
CGAGGCCATT 


attB2 


Fwd 


CGCGGATCCCATATG-AAACCCAAACCGCAC 


Ndel 


Rev 


CCCGCTCGAG-TCAGCGTTGGACGTAGT 


Xhol 


953L 


Fwd 


GGGAATTCCATATG-AAAAAAATCATCTTCGCCG 


Ndel 


Rev 


CCCGCTCG_AG/ITATTGTTTGGCTGCCTCGAT 


Xhol 


953-fu


Fwd 


GGGAATTCCATATG-GCCACCTACAAAGTGGACG 


Ndel 


Rev 


CGGGG^JCC-TTGTTTGGCTGCCTCGATTTG 


BamHI
954 (His-GST)
(MC58) 


Fwd 


CGCGGATCCCATATGCAAGA AC \A ATTGC AG A A Aft 


BamHI-Ndel 


Rev 


cccGcrcGAG'rrrri'rcGGCAAATTnnrnr 


/VI1UJL 


958-His/GST 
(MC58) CATB 


Fwd 
Rev 


eseeacaaetttetacaaaaaaecaeectGCCGATGCCGTTGCnn 
ggggaccactttgtacaagaaagctgggtTCAGGGTCGTTTGTTGCG 


ntiR / 

attBl 


961L 


Fwd 


CGCGGATCC£AIMG-AAACACnTCCATCC 


Ndel 


Rev 


CCCGCTCGAG-TTACCACTCGTAATTGAC 


Xhol 


961 


Fwd 


CGCGGATCCCAIAIQ-GCCACAAGCGACGAC 


Ndel 


Rev 


CCCGCTCGAG-TTACCACTCGTAATTGAC


Xhol 


961 c (His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACG 


BamHI-Ndel 


Rev 


CCCGCTCGAG-ACCCACGTTGTAAGGTTG 


Xhol 


961 c-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGACOA 


BamHI-Ndel 


Rev 


CCCGC1CGAQ-ACCCACGTTGTAAGGTTG 


Xhol 


961 c-L 


Fwd 


CGCGGATCCCAIAIQ-ATGAAACACTTTCCATCC 


Ndel 


Rev 


CCCGglCQAfi-TTAACCCACGTTGTAAGGT 


Xhol 


961 c-L 
(MC58) 


Fwd 


CGCGGATCCCAJAIG-ATGAAACACTTTCCATCC 


Ndel 


Rev 


CCCGCTCGAIi-TTAACCCACGTTGTAAGGT 


Xhol 


961 d (His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAACGACG 


BamHI-Ndel 




Rev 


CCCGCTCGAG-GTCTGACACTGTTTTATCC 


Xhol 


961A1-L 


Fwd 


CGCGGATCCCATATG-ATGAAACACTTTCCATCC 


Ndel 


Rev 


CCCGCJCGAQ-TTATGCTTTGGCGGCAAAG 


Xhol 


fu 961-... 


Fwd 


CGCGGATCCCATATG- GCCACAAACGACGAC 


Ndel 


Rev 


CGCGiiATCC-CCACTCGTAATTGACGCC 


BamHI 


fu 961-... 
(MC58) 


Fwd 


CGCGGATCCCAJATjQ-GCCACAAGCGACGAC 


Ndel 


Rev 


CGCGGATCC-CCACTCGTAATTGACGCC 


BamHI 


fu961c-... 


Fwd 


CGCGGATCCCAIA1G-GCCACAAACGACGAC 


Ndel 




Rev 


CGCGGATCC -ACCCACGTTGTAAGGTTG 


BamHI


fu 961 c-L-... 


Fwd 


CGCGGATCCCATATG- ATGAAACACTTTCCATCC 


Ndel 


Rev 


CGCGGATCC -ACCCACGTTGTAAGGTTG 


BamHI 



fu (961 )- 
741(MC58)-His 


Fwd 


CGCGGATCC -GGAGGGGGTGGTGTCG 


"RamHT 



Rev 


CCCGCTCGAG-TTGCTTGGCGGCAAGGC 


Xhol 



fu (961 )-983-His 


Fwd 


CGCGGATCC - GGCGGAGGCGGCACTT 


BamHI


Rev 


CCCGCECQAQ-GAACCGGTAGCCTACG 


Xhol 


fu (961)- Orf46.1- 
His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
TCAG ATTTGGC A A ACG ATTP 


BamHI 


Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


fu (961 c-L)~ 
741(MC58) 


Fwd 


CGCGGATCC -GGAGGGGGTGGTGTCG 


BamHI 




Rev 


CCCGCTCGAG-TTATTGCTTGGCGGCAAG 


Xhol 


lu ^yoic-ju j-yoo 


1 WU 




DdULTll 


Rev 


CCCGCTCGAG-TCAGAACCGGTAGCCTAC 


Xhol 


fu (961c-L). 
Orf46.1 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 
TCAGATTTGGCAAACGATTC 


BamHI 


Rev 


CCCGCTCGAG-TTACGTATCATATTTCACGTGC 


Xhol 


961-(His/GST) 


Fwd 


CGCGGATCCCATATG-GCCACAAGCGACGACG 


BamHI-Ndel 
(MC58)


Rev 


CCCGCTCG^a-CCACTCGTAATTGACGCC 


Xhol 


961A1-His 
961a-(His/GST) 


Fwd 


CGCGGATCCCAIAIG-GCCACAAACGACGAC 


Ndel 


Rev 


CCCGCTTSAQ-TGCTTTGGCGGCAAAGTT 


Xhol 


Fwd 
Rev 


cck:ggatcccaiaig-gccacaaacgacgac 
cccgct^gag-tttagcaatattatctitgttcgtagc 


BamHI-Ndel 
Xhol 


961b-(His/GST) 


Fwd 


CGCGGATCCCATATQ-AAAGCAAACCGTGCCGA 


BamHI-Ndel 


Rev 


CCCGCTCGAG-CCACTCGTAATTGACGCC 


Xhol 


961-His/GST GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctGCAGCCACAAACGACGACG 
ATGTTAAAAAAGC 

ggggaccactttgtacaagaaagctgggtTTACCACTCGTAATTGACGC 
CGACATGGTAGG 


attBl 
attB2 




982 


Fwd 


GCGGCCATATG-GCAGCAAAAGACGTACAGTT 


Ndel 


Rev 


GCGGCCTCGAG-TTACATCATGCCGCCCATACCA 


Xhol 


983-His (2996) 


Fwd 


CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


AG983-His (2996) 


Fwd 


CCCCTAGCTAGC-ACTTCTGCGCCCGACTT 


Nhel 


Rev 


CCCGCTCGAS-GAACCGGTAGCCTACG 


Xhol 


983-His 


Fwd 


CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


AG983-His 


Fwd 


CGCGGATCCGCTAGC-ACITCTGCGCCCGACTT 


Nhel 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


983L 


Fwd 


CGCGGATCCGCTAGC- 

CGAACGACCCCAACCTTCCCTACAAAAACTTTCAA 


Nhel 


Rev 
Fwd 
Rev 


CCCGCTCGAG-TCAGAACCGACGTGCCAAGCCGTTC 

GCCGCCATATGCCCCCACTGGAAGAACGGACG 

GCCGCCTCGAGTAATAAACCTTCTATGGGCAGCAG 


Xhol 
Ndel 
Xhol 


987-His (MC58) 




989-(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG-TCCGTCCACGCATCCG 


BamHI-Ndel 


Rev 


CCCGCTCGAG-TTTGAATTTGTAGGTGTATTG 
CGCGGATCCCATATG-ACCCCTTCCGCACT 


Xhol 
Ndel 


989L 
(MC58) 


Fwd 


Rev 


CCCGCTGGAjG-TTATTTGAATTTGTAGGTGTAT 


Xhol 


CrgA-His 
(MC58) 


Fwd 


CGCGGATCCCATATG-AAAACCAATTCAGAAGAA 


Ndel 


Rev 
Fwd 


CCCGCTCGAG-TCCACAGAGATTGTTTCC 
GATGCCCGAAGGGCGGG 


Xhol 


PUC1-ES 
(MC58) 


Rev 


GCCCAAGCTT-TCAGAAGAAGACTTCACGC 




PilCl-His 
(MC58) 


Fwd 


CGCGGATCCCATATG-CAAACCCATAAATACGCTATT 


Ndel 


Rev 


GCCCAAGCTT-GAAGAAGACTTCACGCCAG 


Hindm 


AlPUCl-ffis 
(MC58) 


Fwd 


CGCGGATCCCATATG-GTCTTTTTCGACAATACCGA 


Ndel 


Rev 


GCCCAAGCTT- 


Hindm 


PilCIL 
(MC58) 


Fwd 
Rev 


CGCGGATCCCATATG-AATAAAACTTTAAAAAGGCGG 
GCCCAAGCTT-TCAGAAGAAGACTTCACGC 


Ndel 
Hindm 


AGTbp2-His 
(MC58) 


Fwd 


CGCGAATCCCAIMe-TTCGATCTTGATTCTGTCGA 


Ndel 


Rev 


CCCGCTCGAG-TCGCACAGGCTGTTGGCG 


Xhol 


Tbp2-His 
(MC58) 


Fwd 


CGCGAATCCCAIMS-TTGGGCGGAGGCGGCAG 


Ndel 


Rev 


CCCGCTCjSAG-TCGCACAGGCTGTTGGCG 


Xhol 


Tbp2-His(MC58) 


Fwd 


CGCGAATCCCATATG-TTGGGCGGAGGCGGCAG 


Ndel 


Rev 


CCCGCTCGAG-TCGCACAGGCTGTTGGCG 


Xhol 
NMB0109-
(His/GST)
(MC58)


Fwd 


CGCGGATCCCATATG-GCA A ATTTnn Aanrncar 


d ainrii-iN aei 


Rev 


CCCGCTCGAG-TTCGGAGGGGTTn A AHT 


XhoI


NMB0109L 

(MC58)


Fwd 


CGCGG ATCCC ATATG-C AACGTCGT ATT AT A A CCC 




Rev 


CCCGCICGAG-TTATTCGGAGCGGTTGAAG 


Xhol 


NMB0207- 
(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG- 

C\nC ATP A A AGTCGCC ATP A APOOTT A r* 


BamHI-Ndel 


Rev 


CCCGCICGAG-TTTGAGCGGGCGCACTTCAAGTCCG 


Xhol 


NMB0462* 
(His/GST) 

(MC58) 


Fwd 


CGCGGATCCCATATG-GGCGGCAnrr, A AAAAAAP 


BamHI-Ndel 


Rev 


CCCGCTCQACi-GTTGGTGCCGACTTTGAT 


Xhol 


NMB0623- 

(His/GST) 
(MC58) 


Fwd


CGCGGATCCCATATG-GGCGGCGr.AArtrnATA 


BamHI-Ndel 


Rev 


CCCGCIQSAG-TTTGCCCGCITTGAGCC 


Xhol 


NMB0625 (His- 
GST)(MC58) 


Fwd 


CGCGGATCCC ATATGGGC AAATCCG A A A AT APO 


BamHI-Ndel 


Rev 


CCCGO£GAGCATCCCGTACTGTTTCG 


Xhol 


NMB0634 

(His/GST)(MC58)


Fwd 


ggggacaagtttgtacaaaaaagcaggctCCGACATTACCGTGTACAAC 
GGCCAACAAAGAA 


attBl 


Rev 


ggggaccacmgtacaagaaagctgggtCITATTTCATACCGGK:TTGCT 
CAAGCAGCCGG 


attB2 


NMB0776- 
His/GST (MC58)

GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctGATACGGTGlTTTCCTGTAA 
AACGGACAACAA 

ggggaccactttgtacaagaaagctgggtCTAGGAAAAATCGTCATCGT 
TGAAATTCGCC 


attBl 
attB2 


NMB1115- 
His/GST (MC58) 

GATE 


Fwd 
Rev 


ggggacaagtttgtacaaaaaagcaggctATGCACCCCATCGAAACC 
eeegaccactttstacaaeaaagctffe^CTAGTCTrGCAGTGCCTC 


attBl 
attB2 


NMB1343- 

(His/GST) 
(MC58) 


Fwd 


CGCGGATCCCATATG- 
GGAAATTTCTTATATAGAGGCATTAG 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

GTTAATTTCTATCAACTCTTTAGCAATAAT 


Xhol 


NMB1369 (His- 


Fwd 


CGCGGATCCCATATGGCCTGCCAAGACGACA 


BamHI-Ndel 


Rev 


CCCGCTCG&GCCGCCTCCTGCCGAAA 


Xhol 


NMB1551 (His- 


Fwd 


CGCGGATCCCATATGGCAGAGATCTOTTTfi AT A A 


BamHI-Ndel 


Rev 


CCCGCTCGAGCGGTTTTCCGCCCAATG 


Xhol 


NMB1899 (His- 


Fwd 


CGCGGATCCCATATGCAGCCGGATACfifiTC 


BamHI-Ndel 


Rev 


CCCGCrCGAGAATCACTTCCAACACAAAAT 


Xhol 


NMB2050- 

(His/GST) 

(MC58)


Fwd 


CGCGGATCCCATATG-TGGTTGCTGATGAAGGGC 


BamHI-Ndel 


Rev 


CCCGCTCGAG-GACTGCTTCATCTTCTGC 


Xhol 


NMB2050L 
(MC58) 


Fwd 


CGCGGATCCCATATG-GAACTGATGACTGTTTTnC 


Ndel 


Rev 


CCCGCTCGAG-TCAGACTGCTTCATCTTCT 


Xhol 


NMB2159- 
(His/GST) 
(MC58) 

fu-AG287...-His 


Fwd 


CGCGGATCCCATATG- 
AGCATTAAAGTAGCGATTAACGGTTTCGGC 


BamHI-Ndel 


Rev 


CCCGCTCGAG- 

GATTTTGCCTGCGAAGTATTCCAAAGTGCG 


Xhol 


Fwd 


CGCGGATCCGCEAGC-CCCGATGTTAAATCGGC 


Nhel 
Rev


CGGQQAJ^-ATCCTGCTCTTnTTGCCGG 


BamHI 


ftj-(AG287)-919- 
His 


-i-> 1 

Fwd 


CGCGGATCCGGTGGTGGTGGT- 
CAAAGCAAGAGCATrCAAACC 


BamHI 


Rev 


CCCAAGCTT-TTCGGGCGGTATTCGGGCTTC 


HindIII


fu-fAG287V953- 
His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- — 
GCCACCTACAAAGTGGAC 


BamHI


Rev 


GCCCAAGCTT-TTGTTTGGCTGCCTCGAT 


HindIII


His 


Fwd 


CGCGGATrrnnTnnTnfiTGnT-ArAAnrnArnArn 


BamHI


Rev 


GCCCAAGCTT-CCACTCGTAATTGACGCC 


HindUl 


fu-(AG287)- 

s~\ fA £ ■* TY2- 

(_>rf46.1-His 


Fwd 


CGCGGATCCGGTGGTGGTGGT- 

TV A A TTTYW 1 AAA /"V* A TTV 

1CAUAI 1 IljvjCAAACOAl 1T_ 


BamHI 


Rev 


cccaagcii<:gtatcatatttcacgtgc 


Hindffl 


fu-(AG287-919)- 
Orf 46.1 -His 


Fwd 


CCCAAGCTTGGTGGTGGTGGTGGT- 
TCAGA1 i TGGCAAACGATTC 


Hindm 


Rev 


CCCGCTCGAG-CGTATCATATxTCACGTGC 


Xhol 


fu-(AG287- 
urt4o.i j-yiy-His 


Fwd 


CCCAAGCTTGGTGGTGGTGGTGGT- 


Hindm 


Rev 


CCCGCTCGAG-CGGGCGGTATTCGnGCTT 


XhoI


fii Ad87f"*<)4 QftY. 
*u Aivjr / \ jy*9*yo ) m 

••• 


Fwd 


CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC 


l^llICl 


IXC V 


VrfVJVJUUA X v^\-*~ r\. X X VJ\_ 1L1 11111 VJ\_V_ Vj VJ 


Dairiru. 


fn Hi-fi JC\rt±(\ n. 

His 


Fwd 


PHPrSOATPPHPTAHP nnAPAPAPTTATTTTPiriPATP 


iNnei 


Pair 


pPfPnn a tpp j^p a rsfrwr a p,ppt a a tttti a t 

LuLUUA i V^i^-C,l^A\J\^VJVJ 1 AwCC 1 A A 111 LlA 1 




fii (Orfl)-Orf46.1- 
His 


rwu 


TCAGATTTGGCAAACGATTC 


BamHI


Rev 


CCCAAGCTT-CGTATCATATTTCACGTGC 


Hindm 


fu (919)-Orf46.1- 
His 


Fwdl 


GCGGCGjmACGGTGGCGGAGGCACTGGATCCTCAG 


Sail 


Fwd2 


GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC 




Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


Fu orf46-.... 


Fwd 


GGAATTCCATATGTCAGATTTGGCAAACGATTC 


Ndel 


Rev


CnCCifi A TCCCCiT A TP A T A TTTP A CCTCiC 


Ram ITT 


ru (_ori40j"^o / -nis 


rwu 




BamHI


Rev 


CCCAAGCTTATCCTGCTCTTTTTTGCCGGC 


Hindm 


Fu (orf46)-919-His 


Fwd 


CGCGGATCCGGTGGTGGTGGTCAAAGCAAGAGCATCCA 
AACC 


BamHI 


Rev 


CCCAAQ£3TCGGGCGGTATTCGGGCTTC 


Hindm 


Fu (orf46-919)- 

aom TT»_ 

287-His 


Fwd 


CCCCMQCHGGGGGCGGCGGTGGCG 


Hindm 


Rev 


CCCGCTCGAGATCCTGCTCTTTTTTGCCGGC 


Xhol 


Fu (orf46-287)- 
919-His 


Fwd 


CCCAAGCTTGGTGGTGGTGGTGGTCAAAGCAAGAGCAT 
CCAAACC 


Hindffl 


Rev 


CCCGCTCGAGCGGGCGGTATTCGGGCTT 


Xhol 


(AG741 )-961c-His 


rwai 
Fwd2 


uuAUuLAt 1 UuA 1 CCuC AUCC AUAAACU ACUACUA 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG 


XhoI


Rev 


CCCGCTCGAG-ACCCAGICTTGTAAGG'xTG 


Xhol 


(AG741 ).961-His 


Fwdl 
Fwd2 


GGAGGCACTGGATCCGCAGCCACAAACGACGACGA 
GCGGCC1CGAQ-GGTGGCGGAGGCACTGGATCCGCAG 


Xhol 


Rev 


CCCGCTCGAQ-CCACTCGTAATTGACGCC 


Xhol 
(AG741 )-983-His


Fwd 


GGATCCGGCGGAGGCGGCACTTCTGCG 


Xhol 


Rev 


CCCGCTCGAG-GAACCGGTAGCCTACG 


Xhol 


(AG741 )-orf46.1- 
His 


Fwdl 
Fwd2 


GG AGGCACTGG ATCCTC AG ATTTfinr A A Am attp 
GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA 


Sail 


Rev 


CCCGCTCGAG-CGTATCATATTTCACGTGC 


Xhol 


(AG983V 
741(MC58) -His 


Fwd 


VJ \-»VJVJ\— X V-'VJi^VJ-Vj\JrV X ^^UVJHl TV TV IX II T 1 I 1 1 t 1 I T 1 1 1 ji 1 


Xhol 


Rev 


CCCGCTCGAG-rrGCTTGGCGGT A AH 


Xhol 


(AG983)-961c-His 


Fwdl 
Fwd2 


GG AGGC ACTGG ATCCGC AGCC AC AAA rn A rn a r n a 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG 


XhoI




Rev 


CCCGCTOGAQ-ACCCAGCTTGTAAGGTTG 


Xhol 


(AG983)-961-His 


Fwdl 
Fwd2 


GG AGGC ACTGG ATCCGT AGCC A c A A A rn a rn a rr, a 
GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG 


Xhol 


Rev 


CCCGCTCGAG-CCACTCGTAATTGACGCC 


Xhol 


(AG983)-Orf4«.l- 
His 


Fwdl 
Fwd2 


GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC 
GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA 


Sail 


Rev 


CCCGCJCG^-CGTATCATATTrCACGTGC 


Xhol 



* This primer was used as a Reverse primer for all the C-terminal fusions of 287 to the His-tag.

§ Forward primers used in combination with the 287-His Reverse primer.

NB - All PCR reactions use strain 2996 unless otherwise specified (e.g. strain MC58).



In all constructs starting with an ATG not followed by a unique NheI site, the ATG codon is
part of the NdeI site used for cloning. The constructs made using NheI as a cloning site at the
5' end (e.g. all those containing 287 at the N-terminus) have two additional codons (GCT
AGC) fused to the coding sequence of the antigen.
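
The relationship between primer tails and cloning sites can be checked mechanically. The
following is a minimal sketch, not part of the original protocol: it assumes the primer is written
as in the table above, with a hyphen separating the 5' tail from the gene-specific part, and it
uses the standard recognition sequences of the enzymes named in the table. The function names
are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the primer table.
SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "NcoI": "CCATGG",
}


def sites_in_tail(primer):
    """Return the enzymes whose site occurs in the 5' tail, in 5'->3' order.

    The tail is taken to be the text before the first hyphen; if there is
    no hyphen the whole primer is scanned.
    """
    tail = primer.split("-", 1)[0].replace(" ", "").upper()
    found = [(tail.find(seq), name) for name, seq in SITES.items() if seq in tail]
    return [name for _, name in sorted(found)]


def extra_codons(primer):
    """Codons fused ahead of the antigen when NheI is the 5' cloning site."""
    return ["GCT", "AGC"] if "NheI" in sites_in_tail(primer) else []


if __name__ == "__main__":
    print(sites_in_tail("CGCGGATCCCATATG-AAACACACCGTATCC"))
    print(extra_codons("CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG"))
```

For an NdeI tail the ATG belongs to the site itself (CATATG), so no extra codons are reported;
for an NheI tail the two codons GCT AGC precede the antigen sequence.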

Preparation of chromosomal DNA templates 

N.meningitidis strains 2996, MC58, 394.98, 1000 and BZ232 (and others) were grown to
exponential phase in 100ml of GC medium, harvested by centrifugation, and resuspended in
5ml buffer (20% w/v sucrose, 50mM Tris-HCl, 50mM EDTA, pH8). After 10 minutes
incubation on ice, the bacteria were lysed by adding 10ml of lysis solution (50mM NaCl, 1%
Na-Sarkosyl, 50µg/ml Proteinase K), and the suspension incubated at 37°C for 2 hours. Two
phenol extractions (equilibrated to pH 8) and one CHCl3/isoamyl alcohol (24:1) extraction
were performed. DNA was precipitated by addition of 0.3M sodium acetate and 2 volumes
of ethanol, and collected by centrifugation. The pellet was washed once with 70% (v/v)
ethanol and redissolved in 4.0ml TE buffer (10mM Tris-HCl, 1mM EDTA, pH 8.0). The
DNA concentration was measured by reading OD260.
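
As an illustration of how an OD260 reading translates into a DNA concentration, the sketch
below applies the common approximation that one OD260 unit corresponds to about 50 µg/ml of
double-stranded DNA. This conversion factor and the dilution factor are assumptions of the
example; the text above states only that OD260 was read. The 4.0 ml volume is the TE volume
given above.

```python
# Assumed conversion: 1.0 OD260 unit ~ 50 ug/ml double-stranded DNA.
UG_PER_ML_PER_OD260 = 50.0


def dsdna_ug_per_ml(od260, dilution_factor=1.0):
    """Estimate dsDNA concentration (ug/ml) of the undiluted sample."""
    return od260 * UG_PER_ML_PER_OD260 * dilution_factor


def total_yield_ug(od260, dilution_factor=1.0, volume_ml=4.0):
    """Estimate total DNA (ug) in the TE solution (4.0 ml in the protocol)."""
    return dsdna_ug_per_ml(od260, dilution_factor) * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: OD260 of 0.5 measured on a 1:20 dilution.
    print(dsdna_ug_per_ml(0.5, 20), total_yield_ug(0.5, 20))
```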

PCR Amplification

The standard PCR protocol was as follows: 200ng of genomic DNA from 2996, MC58, 1000,
or BZ232 strains or 10ng of plasmid DNA preparation of recombinant clones were used as
template in the presence of 40µM of each oligonucleotide primer, 400-800µM dNTPs
solution, 1x PCR buffer (including 1.5mM MgCl2), 2.5 units Taq DNA polymerase (using
Perkin-Elmer AmpliTaq, Boehringer Mannheim Expand™ Long Template).

After a preliminary 3 minute incubation of the whole mix at 95°C, each sample underwent a
two-step amplification: the first 5 cycles were performed using the hybridisation temperature
that excluded the restriction enzyme tail of the primer (Tm1). This was followed by 30 cycles
according to the hybridisation temperature calculated for the whole length oligos (Tm2).
Elongation times, performed at 68°C or 72°C, varied according to the length of the Orf to be
amplified. In the case of Orf1 the elongation time, starting from 3 minutes, was increased by
15 seconds each cycle. The cycles were completed with a 10 minute extension step at 72°C.
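
The two hybridisation temperatures can be estimated from the primer sequence. The sketch
below is illustrative only: the text does not state which formula was used, so a common
base-composition approximation, Tm = 64.9 + 41 x (G+C - 16.4) / N, is assumed here. The
primer is the 926L forward primer from the table above, with the hyphen marking the boundary
between the restriction tail and the gene-specific part. Function names are hypothetical.

```python
def gc_tm(seq):
    """Rough melting temperature (deg C) from base composition.

    Assumed formula: 64.9 + 41 * (G + C - 16.4) / N. It is an estimate,
    not the calculation used in the original work.
    """
    seq = seq.replace(" ", "").upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def two_step_program(primer):
    """Return [(cycles, estimated Tm)] for the two amplification steps.

    Step 1 (5 cycles) uses the gene-specific part only (Tm1);
    step 2 (30 cycles) uses the whole oligo including the tail (Tm2).
    """
    tail, specific = primer.split("-", 1)
    tm1 = gc_tm(specific)
    tm2 = gc_tm(tail + specific)
    return [(5, round(tm1, 1)), (30, round(tm2, 1))]


if __name__ == "__main__":
    print(two_step_program("CGCGGATCCCATATG-AAACACACCGTATCC"))
```

In practice the annealing temperature is usually set a few degrees below the estimated Tm, and
the lower of the forward and reverse primer values would be used for each step.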

The amplified DNA was loaded directly on a 1% agarose gel. The DNA fragment
corresponding to the band of correct size was purified from the gel using the Qiagen Gel
Extraction Kit, following the manufacturer's protocol.

Digestion of PCR fragments and of the cloning vectors 

The purified DNA corresponding to the amplified fragment was digested with the
appropriate restriction enzymes for cloning into pET-21b+, pET-22b+ or pET-24b+. Digested
fragments were purified using the QIAquick PCR purification kit (following the
manufacturer's instructions) and eluted with either H2O or 10mM Tris, pH 8.5. Plasmid
vectors were digested with the appropriate restriction enzymes, loaded onto a 1.0% agarose
gel and the band corresponding to the digested vector purified using the Qiagen QIAquick
Gel Extraction Kit.

Cloning

The fragments corresponding to each gene, previously digested and purified, were ligated
into pET-21b+, pET-22b+ or pET-24b+. A molar ratio of 3:1 fragment/vector was used with
T4 DNA ligase in the ligation buffer supplied by the manufacturer.
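
A 3:1 molar ratio is converted to masses by scaling for fragment length. The sketch below
shows the arithmetic; the vector mass, vector length and insert length in the example are
hypothetical values chosen for illustration and are not taken from the text.

```python
def insert_ng_for_ratio(vector_ng, vector_bp, insert_bp, ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * ratio


if __name__ == "__main__":
    # Hypothetical example: 50 ng of a 5400 bp vector and a 1200 bp fragment.
    print(round(insert_ng_for_ratio(50, 5400, 1200), 1))
```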

Recombinant plasmid was transformed into competent E.coli DH5 or HB101 by incubating
the ligase reaction solution and bacteria for 40 minutes on ice, then at 37°C for 3 minutes.
This was followed by the addition of 800µl LB broth and incubation at 37°C for 20 minutes.
The cells were centrifuged at maximum speed in an Eppendorf microfuge, resuspended in
approximately 200µl of the supernatant and plated onto LB ampicillin (100mg/ml) agar.

Screening for recombinant clones was performed by growing randomly selected colonies
overnight at 37°C in 4.0ml of LB broth + 100µg/ml ampicillin. Cells were pelleted and
plasmid DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the
manufacturer's instructions. Approximately 1µg of each individual miniprep was digested
with the appropriate restriction enzymes and the digest loaded onto a 1-1.5% agarose gel
(depending on the expected insert size), in parallel with the molecular weight marker (1kb
DNA Ladder, GIBCO). Positive clones were selected on the basis of the size of insert.

Expression 

After cloning each gene into the expression vector, recombinant plasmids were transformed
into E.coli strains suitable for expression of the recombinant protein. 1µl of each construct
was used to transform E.coli BL21-DE3 as described above. Single recombinant colonies
were inoculated into 2ml LB+Amp (100µg/ml), incubated at 37°C overnight, then diluted
1:30 in 20ml of LB+Amp (100µg/ml) in 100ml flasks, to give an OD600 between 0.1 and 0.2.
The flasks were incubated at 30°C or at 37°C in a gyratory water bath shaker until OD600
indicated exponential growth suitable for induction of expression (0.4-0.8 OD). Protein
expression was induced by addition of 1.0mM IPTG. After 3 hours incubation at 30°C or
37°C the OD600 was measured and expression examined. 1.0ml of each sample was
centrifuged in a microfuge, the pellet resuspended in PBS and analysed by SDS-PAGE and
Coomassie Blue staining.

Gateway cloning and expression 

Sequences labelled GATE were cloned and expressed using the GATEWAY Cloning
Technology (GIBCO-BRL). Recombinational cloning (RC) is based on the recombination
reactions that mediate the integration and excision of phage into and from the E.coli genome,
respectively. The integration involves recombination of the attP site of the phage DNA within
the attB site located in the bacterial genome (BP reaction) and generates an integrated phage
genome flanked by attL and attR sites. The excision recombines attL and attR sites back to attP
and attB sites (LR reaction). The integration reaction requires two enzymes [the phage protein
Integrase (Int) and the bacterial protein integration host factor (IHF)] (BP clonase). The
excision reaction requires Int, IHF, and an additional phage enzyme, Excisionase (Xis) (LR
clonase). Artificial derivatives of the 25-bp bacterial attB recombination site, referred to as B1
and B2, were added to the 5' end of the primers used in PCR reactions to amplify Neisserial
ORFs. The resulting products were BP cloned into a "Donor vector" containing complementary
derivatives of the phage attP recombination site (P1 and P2) using BP clonase. The resulting
"Entry clones" contain ORFs flanked by derivatives of the attL site (L1 and L2) and were
subcloned into expression "destination vectors" which contain derivatives of the attL-
compatible attR sites (R1 and R2) using LR clonase. This resulted in "expression clones" in
which ORFs are flanked by B1 and B2 and fused in frame to the GST or His N terminal tags.
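
The attB tails added to the GATE primers appear in lower case in the primer table above. As a
minimal sketch of how such primers are assembled, and assuming that the lower-case prefixes
in the table are the complete B1 and B2 tails, the following code builds a primer from a
gene-specific sequence and identifies which tail an existing primer carries. The function names
are hypothetical.

```python
# Lower-case tails as they appear on the GATE primers in the table.
ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"  # forward primers
ATTB2 = "ggggaccactttgtacaagaaagctgggt"  # reverse primers


def gateway_primer(gene_specific, direction):
    """Prefix a gene-specific sequence with the attB1 (fwd) or attB2 (rev) tail."""
    tails = {"fwd": ATTB1, "rev": ATTB2}
    return tails[direction] + gene_specific.upper()


def att_site(primer):
    """Return 'attB1' or 'attB2' if the primer starts with that tail, else None."""
    for name, tail in (("attB1", ATTB1), ("attB2", ATTB2)):
        if primer.lower().startswith(tail):
            return name
    return None


if __name__ == "__main__":
    fwd = gateway_primer("TGCGGCAAGGATGCCGG", "fwd")
    print(fwd, att_site(fwd))
```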

The E.coli strain used for GATEWAY expression is BL21-SI. Cells of this strain are induced
for expression of the T7 RNA polymerase by growth in medium containing salt (0.3 M NaCl).

Note that this system gives N-terminus His tags.

Preparation of membrane proteins.

Fractions composed principally of either inner, outer or total membrane were isolated in
order to obtain recombinant proteins expressed with membrane-localisation leader
sequences. The method for preparation of membrane fractions, enriched for recombinant
proteins, was adapted from Filip et al. [J.Bact. (1973) 115:717-722] and Davies et al.
[J.Immunol.Meth. (1990) 143:215-225]. Single colonies harbouring the plasmid of interest
were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were
diluted 1:30 in 1.0 L of fresh medium and grown at either 30°C or 37°C until the OD550
reached 0.6-0.8. Expression of recombinant protein was induced with IPTG at a final
concentration of 1.0 mM. After incubation for 3 hours, bacteria were harvested by
centrifugation at 8000g for 15 minutes at 4°C and resuspended in 20 ml of 20 mM Tris-HCl
(pH 7.5) and complete protease inhibitors (Boehringer-Mannheim). All subsequent
procedures were performed at 4°C or on ice.

Cells were disrupted by sonication using a Branson Sonifier 450 and centrifuged at 5000g
for 20 min to sediment unbroken cells and inclusion bodies. The supernatant, containing
membranes and cellular debris, was centrifuged at 50000g (Beckman Ti50, 29000rpm) for
75 min, washed with 20 mM Bis-tris propane (pH 6.5), 1.0 M NaCl, 10% (v/v) glycerol and
sedimented again at 50000g for 75 minutes. The pellet was resuspended in 20mM Tris-HCl
(pH 7.5), 2.0% (v/v) Sarkosyl, complete protease inhibitor (1.0 mM EDTA, final
concentration) and incubated for 20 minutes to dissolve inner membrane. Cellular debris was
pelleted by centrifugation at 5000g for 10 min and the supernatant centrifuged at 75000g for
75 minutes (Beckman Ti50, 33000rpm). Proteins 008L and 519L were found in the
supernatant, suggesting inner membrane localisation. For these proteins both inner and total
membrane fractions (washed with NaCl as above) were used to immunise mice. Outer
membrane vesicles obtained from the 75000g pellet were washed with 20 mM Tris-HCl (pH
7.5) and centrifuged at 75000g for 75 minutes or overnight. The OMV was finally
resuspended in 500 µl of 20 mM Tris-HCl (pH 7.5), 10% v/v glycerol. Orf1L and Orf40L
were both localised and enriched in the outer membrane fraction which was used to
immunise mice. Protein concentration was estimated by standard Bradford Assay (Bio-Rad),
while protein concentration of inner membrane fraction was determined with the DC protein
assay (Bio-Rad). Various fractions from the isolation procedure were assayed by SDS-PAGE.

Purification of His-tagged proteins

Various forms of 287 were cloned from strains 2996 and MC58. They were constructed with
a C-terminus His-tagged fusion and included a mature form (aa 18-427), constructs with
deletions (Δ1, Δ2, Δ3 and Δ4) and clones composed of either B or C domains. For each
clone purified as a His-fusion, a single colony was streaked and grown overnight at 37°C on
a LB/Amp (100 µg/ml) agar plate. An isolated colony from this plate was inoculated into
20ml of LB/Amp (100 µg/ml) liquid medium and grown overnight at 37°C with shaking.
The overnight culture was diluted 1:30 into 1.0 L LB/Amp (100 µg/ml) liquid medium and
allowed to grow at the optimal temperature (30 or 37°C) until the OD550 reached 0.6-0.8.
Expression of recombinant protein was induced by addition of IPTG (final concentration
1.0mM) and the culture incubated for a further 3 hours. Bacteria were harvested by
centrifugation at 8000g for 15 min at 4°C. The bacterial pellet was resuspended in 7.5 ml of
either (i) cold buffer A (300 mM NaCl, 50 mM phosphate buffer, 10 mM imidazole, pH 8.0)
for soluble proteins or (ii) buffer B (10mM Tris-HCl, 100 mM phosphate buffer, pH 8.8 and,
optionally, 8M urea) for insoluble proteins. Proteins purified in a soluble form included
287-His, Δ1, Δ2, Δ3 and Δ4287-His, Δ4287MC58-His, 287c-His and 287cMC58-His.
Protein 287bMC58-His was insoluble and purified accordingly. Cells were disrupted by
sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged
at 13000g for 30 min at 4°C. For insoluble proteins, pellets were resuspended in 2.0 ml
buffer C (6 M guanidine hydrochloride, 100 mM phosphate buffer, 10 mM Tris-HCl, pH 7.5)
and treated with 10 passes of a Dounce homogenizer. The homogenate was centrifuged at
13000g for 30 min and the supernatant retained. Supernatants for both soluble and insoluble
preparations were mixed with 150µl Ni2+-resin (previously equilibrated with either buffer A
or buffer B, as appropriate) and incubated at room temperature with gentle agitation for 30
min. The resin was Chelating Sepharose Fast Flow (Pharmacia), prepared according to the
manufacturer's protocol. The batch-wise preparation was centrifuged at 700g for 5 min at
4°C and the supernatant discarded. The resin was washed twice (batch-wise) with 10ml
buffer A or B for 10 min, resuspended in 1.0 ml buffer A or B and loaded onto a disposable
column. The resin continued to be washed with either (i) buffer A at 4°C or (ii) buffer B at
room temperature, until the OD280 of the flow-through reached 0.02-0.01. The resin was
further washed with either (i) cold buffer C (300mM NaCl, 50mM phosphate buffer, 20mM
imidazole, pH 8.0) or (ii) buffer D (10mM Tris-HCl, 100mM phosphate buffer, pH 6.3 and,
optionally, 8M urea) until OD280 of the flow-through reached 0.02-0.01. The His-fusion
protein was eluted by addition of 700µl of either (i) cold elution buffer A (300 mM NaCl,
50mM phosphate buffer, 250 mM imidazole, pH 8.0) or (ii) elution buffer B (10 mM
Tris-HCl, 100 mM phosphate buffer, pH 4.5 and, optionally, 8M urea) and fractions
collected until the OD280 indicated all the recombinant protein was obtained. 20µl aliquots of
each elution fraction were analysed by SDS-PAGE. Protein concentrations were estimated
using the Bradford assay.

Renaturation of denatured His-fusion proteins.

Denaturation was required to solubilize 287bMC58, so a renaturation step was employed prior
to immunisation. Glycerol was added to the denatured fractions obtained above to give a
final concentration of 10% v/v. The proteins were diluted to 200 µg/ml using dialysis buffer
I (10% v/v glycerol, 0.5M arginine, 50 mM phosphate buffer, 5.0 mM reduced glutathione,
0.5 mM oxidised glutathione, 2.0M urea, pH 8.8) and dialysed against the same buffer for
12-14 hours at 4°C. Further dialysis was performed with buffer II (10% v/v glycerol, 0.5M
arginine, 50mM phosphate buffer, 5.0mM reduced glutathione, 0.5mM oxidised glutathione,
pH 8.8) for 12-14 hours at 4°C. Protein concentration was estimated using the formula:
Protein (mg/ml) = (1.55 × OD280) - (0.76 × OD260)
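
The formula above can be applied directly to a pair of absorbance readings. The sketch below
simply evaluates it; the readings in the example are hypothetical, and the function name is
not from the original text.

```python
def protein_mg_per_ml(od280, od260):
    """Protein concentration (mg/ml) = (1.55 x OD280) - (0.76 x OD260)."""
    return 1.55 * od280 - 0.76 * od260


if __name__ == "__main__":
    # Hypothetical readings for a dialysed sample.
    print(round(protein_mg_per_ml(0.5, 0.3), 3))
```

The OD260 term corrects the OD280 reading for the contribution of contaminating nucleic acid.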

Amino acid sequence analysis.

Automated sequence analysis of the NH2-terminus of proteins was performed on a Beckman
sequencer (LF 3000) equipped with an on-line phenylthiohydantoin-amino acid analyser
(System Gold) according to the manufacturer's recommendations.

Immunization

Balb/C mice were immunized with antigens on days 0, 21 and 35 and sera analyzed at day 49. 

Sera analysis - ELISA

The acapsulated MenB M7 and the capsulated strains were plated on chocolate agar plates
and incubated overnight at 37°C with 5% CO2. Bacterial colonies were collected from the
agar plates using a sterile dacron swab and inoculated into Mueller-Hinton Broth (Difco)
containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following
OD620. The bacteria were allowed to grow until the OD reached 0.4-0.5. The culture
was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and the bacteria
were washed twice with PBS, resuspended in PBS containing 0.025% formaldehyde, and
incubated for 1 hour at 37°C and then overnight at 4°C with stirring. 100µl of bacterial cells
were added to each well of a 96-well Greiner plate and incubated overnight at 4°C. The wells
were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200µl of
saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the
plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200µl of
diluted sera (dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to
each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with
PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in dilution
buffer were added to each well and the plates were incubated for 90 minutes at 37°C. Wells
were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25ml of citrate
buffer pH5, 10mg of o-phenylenediamine and 10µl of H2O2) were added to each well and the
plates were left at room temperature for 20 minutes. 100µl of 12.5% H2SO4 was added to each
well and OD490 was measured. The ELISA titers were calculated arbitrarily as the dilution of
sera which gave an OD490 value of 0.4 above the level of preimmune sera. The ELISA was
considered positive when the dilution of sera with OD490 of 0.4 was higher than 1:400.
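
The titer rule above can be illustrated with a minimal sketch. The text does not say how the dilution giving an OD490 of 0.4 above preimmune was located between measured points, so the log-linear interpolation used here is an assumption, as are the function names and the example readings.

```python
import math

def elisa_titer(dilutions, od490, preimmune_od, delta=0.4):
    """Return the reciprocal serum dilution at which OD490 falls to
    preimmune_od + delta.

    dilutions: reciprocal dilutions in increasing order (e.g. 100, 200, ...)
    od490:     OD490 readings for the same wells
    Interpolation between the two bracketing dilutions is done on a
    log10 dilution scale (an assumption, not stated in the text).
    """
    threshold = preimmune_od + delta
    if od490[0] < threshold:
        return None  # never reaches the threshold, no titer
    for i in range(1, len(dilutions)):
        if od490[i] < threshold:
            x0, x1 = math.log10(dilutions[i - 1]), math.log10(dilutions[i])
            y0, y1 = od490[i - 1], od490[i]
            frac = (y0 - threshold) / (y0 - y1)
            return 10 ** (x0 + frac * (x1 - x0))
    # still above threshold at the highest dilution tested
    return float(dilutions[-1])

def is_positive(titer, cutoff=400):
    """Positive when the titer dilution is higher than 1:400."""
    return titer is not None and titer > cutoff

# Hypothetical readings for one serum
dilutions = [100, 200, 400, 800, 1600]
readings = [1.8, 1.4, 1.0, 0.6, 0.2]
titer = elisa_titer(dilutions, readings, preimmune_od=0.1)
print(round(titer), is_positive(titer))
```

With these invented readings the threshold (0.5) is crossed between 1:800 and 1:1600, so the serum would be scored positive under the 1:400 criterion.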

Sera analysis - FACS Scan bacteria binding assay

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated
overnight at 37°C with 5% CO2. Bacterial colonies were collected from the agar plates using
a sterile dacron swab and inoculated into 4 tubes, each containing 8ml of Mueller-Hinton Broth
(Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by
following OD620. The bacteria were allowed to grow until the OD reached 0.35-0.5.
The culture was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and
the pellet was resuspended in blocking buffer (1% BSA in PBS, 0.4% NaN3) and centrifuged
for 5 minutes at 4000rpm. Cells were resuspended in blocking buffer to reach an OD620 of 0.05.
100µl of bacterial cells were added to each well of a Costar 96-well plate. 100µl of diluted
(1:100, 1:200, 1:400) sera (in blocking buffer) were added to each well and plates incubated
for 2 hours at 4°C. Cells were centrifuged for 5 minutes at 4000rpm, the supernatant
aspirated and cells washed by addition of 200µl/well of blocking buffer. 100µl
of R-phycoerythrin conjugated F(ab)2 goat anti-mouse, diluted 1:100, was added to each well
and plates incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000rpm
for 5 minutes and washed by addition of 200µl/well of blocking buffer. The supernatant was
aspirated and cells resuspended in 200µl/well of PBS, 0.25% formaldehyde. Samples were
transferred to FACScan tubes and read. The FACScan settings (laser power 15mW)
were: FL2 on; FSC-H threshold: 92; FSC PMT voltage: E 01; SSC PMT: 474; Amp.
Gains: 6.1; FL-2 PMT: 586; compensation values: 0.

Sera analysis - bactericidal assay 

N. meningitidis strain 2996 was grown overnight at 37°C on chocolate agar plates (starting
from a frozen stock) with 5% CO2. Colonies were collected and used to inoculate 7ml of
Mueller-Hinton broth containing 0.25% glucose, to reach an OD620 of 0.05-0.08. The culture
was incubated for approximately 1.5 hours at 37°C with shaking until the OD620
reached 0.23-0.24. Bacteria were diluted in 50mM phosphate buffer pH 7.2
containing 10mM MgCl2, 10mM CaCl2 and 0.5% (w/v) BSA (assay buffer) to the working
dilution of 10⁵ CFU/ml. The total volume of the final reaction mixture was 50µl, with 25µl
of serial two-fold dilutions of test serum, 12.5µl of bacteria at the working dilution and 12.5µl of
baby rabbit complement (final concentration 25%).
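
As an arithmetic check on the reaction mixture described above, the following sketch recomputes the total volume, the final complement concentration and the number of CFU per well from the stated volumes. The starting serum dilution in the example is hypothetical; the text only states that two-fold serial dilutions were used.

```python
# Volumes per well as stated in the text (µl)
SERUM_UL = 25.0
BACTERIA_UL = 12.5
COMPLEMENT_UL = 12.5
WORKING_CFU_PER_ML = 1e5  # working dilution of bacteria

total_ul = SERUM_UL + BACTERIA_UL + COMPLEMENT_UL
complement_fraction = COMPLEMENT_UL / total_ul
cfu_per_well = WORKING_CFU_PER_ML * BACTERIA_UL / 1000.0

def final_serum_dilutions(start_reciprocal, n_steps):
    """Reciprocal serum dilutions in the well for a two-fold series.

    start_reciprocal is the reciprocal dilution of the first serum
    sample before it is added to the well (hypothetical value).
    Serum makes up SERUM_UL of total_ul, so each sample is diluted a
    further total_ul / SERUM_UL fold in the reaction mixture.
    """
    in_well_factor = total_ul / SERUM_UL
    return [start_reciprocal * 2 ** i * in_well_factor for i in range(n_steps)]

print(total_ul)                     # total reaction volume in µl
print(complement_fraction)          # fraction of complement in the well
print(cfu_per_well)                 # bacteria added per well
print(final_serum_dilutions(4, 3))  # in-well dilutions for a series starting at 1:4
```

The stated volumes are self-consistent: 12.5µl of complement in 50µl gives the quoted 25% final concentration, and 12.5µl at 10⁵ CFU/ml corresponds to 1250 CFU per well.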

Controls included bacteria incubated with complement serum, immune sera incubated with
bacteria and with complement inactivated by heating at 56°C for 30 minutes. Immediately after the
addition of the baby rabbit complement, 10µl of the controls were plated on Mueller-Hinton
agar plates using the tilt method (time 0). The 96-well plate was incubated for 1 hour at
37°C with rotation. 7µl of each sample were plated on Mueller-Hinton agar plates as spots,
whereas 10µl of the controls were plated on Mueller-Hinton agar plates using the tilt method
(time 1). Agar plates were incubated for 18 hours at 37°C and the colonies
corresponding to time 0 and time 1 were counted.

Sera analysis - western blots 

Purified proteins (500ng/lane), outer membrane vesicles (5µg) and total cell extracts (25µg)
derived from MenB strain 2996 were loaded onto a 12% SDS-polyacrylamide gel and
transferred to a nitrocellulose membrane. The transfer was performed for 2 hours at 150mA
at 4°C, using transfer buffer (0.3% Tris base, 1.44% glycine, 20% (v/v) methanol). The
membrane was saturated by overnight incubation at 4°C in saturation buffer (10% skimmed
milk, 0.1% Triton X100 in PBS). The membrane was washed twice with washing buffer (3%
skimmed milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mouse sera
diluted 1:200 in washing buffer. The membrane was washed twice and incubated for 90
minutes with a 1:2000 dilution of horseradish peroxidase labelled anti-mouse Ig. The
membrane was washed twice with 0.1% Triton X100 in PBS and developed with the
Opti-4CN Substrate Kit (Bio-Rad). The reaction was stopped by adding water.
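
For readers who prepare buffers by molarity, the transfer buffer percentages can be converted as in the sketch below. It assumes the Tris and glycine percentages are weight/volume (the text marks only methanol as v/v) and reads the Tris figure as 0.3%; the function name is illustrative.

```python
# Molar masses in g/mol (standard values)
TRIS_BASE_G_PER_MOL = 121.14
GLYCINE_G_PER_MOL = 75.07

def percent_wv_to_mM(percent_wv, molar_mass):
    """Convert a % (w/v) concentration (grams per 100 ml) to millimolar."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass * 1000.0

tris_mM = percent_wv_to_mM(0.3, TRIS_BASE_G_PER_MOL)
glycine_mM = percent_wv_to_mM(1.44, GLYCINE_G_PER_MOL)

print(f"Tris base: {tris_mM:.1f} mM")
print(f"Glycine:   {glycine_mM:.1f} mM")
```

Under these assumptions the buffer works out to roughly 25mM Tris and 192mM glycine, which matches the Tris-glycine-methanol composition commonly used for electroblotting.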

The OMVs were prepared as follows: N. meningitidis strain 2996 was grown overnight at 37°C
with 5% CO2 on 5 GC plates, harvested with a loop and resuspended in 10 ml of
20mM Tris-HCl pH 7.5, 2 mM EDTA. Heat inactivation was performed at 56°C for 45
minutes and the bacteria disrupted by sonication for 5 minutes on ice (50% duty cycle, 50%
output, Branson sonifier 3 mm microtip). Unbroken cells were removed by centrifugation at
5000g for 10 minutes, and the supernatant containing the total cell envelope fraction was recovered
and further centrifuged overnight at 50000g at 4°C. The pellet containing
the membranes was resuspended in 2% sarkosyl, 20mM Tris-HCl pH 7.5, 2 mM EDTA and
incubated at room temperature for 20 minutes to solubilise the inner membranes. The
suspension was centrifuged at 10000g for 10 minutes to remove aggregates, and the supernatant
was further centrifuged at 50000g for 3 hours. The pellet, containing the outer membranes,
was washed in PBS and resuspended in the same buffer. Protein concentration was measured
by the D.C. Bio-Rad Protein assay (Modified Lowry method), using BSA as a standard.

Total cell extracts were prepared as follows: N. meningitidis strain 2996 was grown
overnight on a GC plate, harvested with a loop and resuspended in 1ml of 20mM Tris-HCl.
Heat inactivation was performed at 56°C for 30 minutes.

961 domain studies 

Cellular fractions preparation

Total lysate, periplasm, supernatant and OMV of E.coli clones
expressing different domains of 961 were prepared using bacteria from overnight cultures or
after 3 hours of induction with IPTG. Briefly, the periplasmic fraction was obtained by suspending
bacteria in 25% sucrose and 50mM Tris (pH 8) with 100µg/ml polymyxin. After 1 hour at room
temperature the bacteria were centrifuged at 13000rpm for 15 min and the supernatant was
collected. The culture supernatant was filtered through a 0.2µm filter and precipitated with 50% TCA
on ice for two hours. After centrifugation (30 min at 13000rpm) pellets were rinsed twice with
70% ethanol and suspended in PBS. The OMV preparation was performed as previously
described. Each cellular fraction was analyzed by SDS-PAGE or by Western blot using the
polyclonal antiserum raised against GST-961.

Adhesion assay

Chang epithelial cells (Wong-Kilbourne derivative, clone 1-5c-4, human
conjunctiva) were maintained in DMEM (Gibco) supplemented with 10% heat-inactivated
FCS, 15mM L-glutamine and antibiotics.

For the adherence assay, sub-confluent cultures of Chang epithelial cells were rinsed with
PBS and treated with trypsin-EDTA (Gibco) to release them from the plastic support. The
cells were then suspended in PBS, counted and diluted in PBS to 5x10⁵ cells/ml.

Bacteria from overnight cultures or after induction with IPTG were pelleted and washed
twice with PBS by centrifuging at 13000 for 5 min. Approximately 2-3x10⁸ cfu were
incubated with 0.5 mg/ml FITC (Sigma) in 1ml of buffer containing 50mM NaHCO3 and
100mM NaCl pH 8, for 30 min at room temperature in the dark. FITC-labeled bacteria were
washed 2-3 times and suspended in PBS at 1-1.5x10⁹/ml. 200µl of this suspension (2-3x10⁸)
were incubated with 200µl (1x10⁵) of epithelial cells for 30 min at 37°C. Cells were then
centrifuged at 2000rpm for 5 min to remove non-adherent bacteria, suspended in 200µl of
PBS, transferred to FACScan tubes and read.
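
The quantities in the adhesion assay can be cross-checked with a short calculation. The sketch below reads the epithelial cell suspension as 5x10⁵ cells/ml, which is the value consistent with the stated 1x10⁵ cells per 200µl, and derives the bacteria-to-cell ratio. That ratio is not given in the text and is computed here for illustration only.

```python
def count_in_volume(conc_per_ml, volume_ul):
    """Number of cells or cfu contained in volume_ul at conc_per_ml."""
    return conc_per_ml * volume_ul / 1000.0

# Epithelial cells: 200 µl of a 5x10^5 cells/ml suspension
epithelial_cells = count_in_volume(5e5, 200)

# FITC-labelled bacteria: 200 µl of a 1-1.5x10^9 /ml suspension
bacteria_low = count_in_volume(1.0e9, 200)
bacteria_high = count_in_volume(1.5e9, 200)

# Derived bacteria-to-cell ratio (not stated in the text)
ratio_low = bacteria_low / epithelial_cells
ratio_high = bacteria_high / epithelial_cells

print(epithelial_cells)
print(bacteria_low, bacteria_high)
print(ratio_low, ratio_high)
```

The computed counts agree with the figures in parentheses in the protocol (2-3x10⁸ bacteria and 1x10⁵ epithelial cells per incubation).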
CLAIMS

1. A method for the heterologous expression of a protein of the invention, in which (a) at
least one domain in the protein is deleted and, optionally, (b) no fusion partner is used.

2. The method of claim 1, in which the protein of the invention is ORF46. 

3. The method of claim 2, in which ORF46 is divided into a first domain (amino acids
1-433) and a second domain (amino acids 433-608).

4. The method of claim 1, in which the protein of the invention is 564.

5. The method of claim 4, in which protein 564 is divided into domains as shown in Figure 
8. 

6. The method of claim 1, in which the protein of the invention is 961.

7. The method of claim 6, in which protein 961 is divided into domains as shown in Figure 
12. 

8. The method of claim 1, in which the protein of the invention is 502 and the domain is
amino acids 28 to 167 (numbered according to the MC58 sequence).

9. The method of claim 1, in which the protein of the invention is 287.

10. A method for the heterologous expression of a protein of the invention, in which (a) a 
portion of the N-terminal domain of the protein is deleted. 

11. The method of claim 9 or claim 10, in which protein 287 is divided into domains A, B &
C as shown in Figure 5.

12. The method of claim 11, in which (i) domain A, (ii) domains A and B, or (iii) domains A
and C are deleted.

13. The method of claim 11, wherein (i) amino acids 1-17, (ii) amino acids 1-25, (iii) amino 
acids 1-69, or (iv) amino acids 1-106, of domain A are deleted. 

14. A method for the heterologous expression of a protein of the invention, in which (a) no
fusion partner is used, and (b) the protein's native leader peptide (if present) is used.

15. The method of claim 14, in which the protein of the invention is selected from the group
consisting of: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1,
503, 519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1,
936-1, 953, 961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1,
Orf40.2, Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109,
NMB2050, 008, 105, 117-1, 121-1, 122-1, 128-1, 148, 216, 243, 308, 593, 652, 726,
926, 982, Orf83-1 and Orf143-1.

16. A method for the heterologous expression of a protein of the invention, in which (a) the
protein's leader peptide is replaced by the leader peptide from a different protein and,
optionally, (b) no fusion partner is used.

17. The method of claim 16, in which the different protein is 961, ORF4, E.coli OmpA, or
E.carotovora PelB, or in which the leader peptide is MKKYLFSAA.

18. The method of claim 17, in which the different protein is E.coli OmpA and the protein of
the invention is ORF1.

19. The method of claim 17, in which the protein of the invention is 911 and the different
protein is E.carotovora PelB or E.coli OmpA.

20. The method of claim 17, in which the different protein is ORF4 and the protein of the 
invention is 287. 

21. A method for the heterologous expression of a protein of the invention, in which (a) the
protein's leader peptide is deleted and, optionally, (b) no fusion partner is used.

22. The method of claim 21, in which the protein of the invention is 919. 

23. A method for the heterologous expression of a protein of the invention, in which 
expression of a protein of the invention is carried out at a temperature at which a toxic 
activity of the protein is not manifested. 

24. The method of claim 23, in which protein 919 is expressed at 30°C.

25. A method for the heterologous expression of a protein of the invention, in which protein 
is mutated to reduce or eliminate toxic activity. 

26. The method of claim 25, in which the protein of the invention is 907, 919 or 922.

27. The method of claim 26, in which 907 is mutated at Glu-117 (e.g. Glu→Gly).

28. The method of claim 26, in which 919 is mutated at Glu-255 (e.g. Glu→Gly) and/or
Glu-323 (e.g. Glu→Gly).

29. The method of claim 26, in which 922 is mutated at Glu-164 (e.g. Glu→Gly), Ser-213
(e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly).

30. A method for the heterologous expression of a protein of the invention, in which vector 
pSM214 is used or vector pET-24b is used. 

31. The method of claim 30, in which the protein of the invention is 953 and the vector is 
pSM214. 

32. A method for the heterologous expression of a protein of the invention, in which a 
protein is expressed or purified such that it adopts a particular multimeric form. 

33. The method of claim 32, in which protein 953 is expressed and/or purified in monomeric 
form. 

34. The method of claim 32, in which protein 961 is expressed and/or purified in tetrameric 
form. 

35. The method of claim 32, in which protein 287 is expressed and/or purified in dimeric 
form. 

36. The method of claim 32, in which protein 919 is expressed and/or purified in monomeric 
form. 

37. A method for the heterologous expression of a protein of the invention, in which the 
protein is expressed as a lipidated protein. 

38. The method of claim 37, in which the protein of the invention is 919, 287, ORF4, 406,
576, or ORF25.

39. A method for the heterologous expression of a protein of the invention, in which (a) the 
protein's C-terminus region is mutated and, optionally, (b) no fusion partner is used. 

40. The method of claim 39, wherein the mutation is a substitution, an insertion, or a deletion.

41. The method of claim 40, wherein the protein of the invention is 730, ORF29 or ORF46. 

42. A method for the heterologous expression of a protein of the invention, in which the 
protein's leader peptide is mutated. 

43. The method of claim 42, in which the protein of the invention is 919.

44. A method for the heterologous expression of a protein, in which a poly-glycine stretch 
within the protein is mutated. 

45. The method of claim 44, wherein the protein is a protein of the invention. 

46. The method of claim 45, wherein the protein of the invention is 287, 741, 983 or Tbp2.

47. The method of claim 46, wherein (Gly)6 is deleted from 287 or 983.

48. The method of claim 46, wherein (Gly)4 is deleted from Tbp2 or 741.

49. The method of claim 47 or claim 48, wherein the leader peptide is also deleted. 

50. The method of any preceding claim, in which the heterologous expression is in an E.coli 
host. 

51. A protein expressed by the method of any preceding claim.

52. A heterologous protein comprising the N-terminal amino acid sequence MKKYLFSAA. 

FIGURE 8

FIGURE 14R - 961 ORF 46.1